Saturday, January 18, 2014

Which morphologically-tagged Hebrew Bible is most accurate?

Technical post alert: While looking around for a free online Hebrew Bible concordance to recommend to my Hebrew students, I noticed that different electronic concordances produce different results. Here, for example, is what you get if you search for the Hebrew verb אָהַב ("to love"):

Online sources that use Strong's KJV:
209 results in 195 verses (Biblearc.com; Crosswire's Bible Tool)
208 results in 195 verses (Blue Letter Bible)

The Groves Center's Westminster Hebrew Morphology:
215 results in 200 verses (WTM 4.2, 4.14, 4.18 in Bibleworks* and Logos)
(*Bibleworks originally gave me 217 results in 200 verses because I had selected both Ketiv and Qere readings.)

Other Logos Tagged Texts
BHS/WIVU (Werkgroep Informatica): 210 results in 197 verses
Lexham Hebrew-English Interlinear: 210 results in 196 verses
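
The Bibleworks footnote above (217 vs. 215 results, both in 200 verses) is what you would expect if a word with both a Ketiv and a Qere form gets counted once per reading. Here is a minimal sketch of that effect in Python. The token records, the verse labels, and the `count_hits` helper are all made up for illustration; this is not how any of the programs above actually store or search their data.

```python
# Hypothetical token records: (verse, lemma, reading).
# "main" marks ordinary text; "ketiv" and "qere" mark the written and
# read variants of the same word. None of this is real database content.
TOKENS = [
    ("Verse 1", "אהב", "main"),
    ("Verse 2", "אהב", "ketiv"),
    ("Verse 2", "אהב", "qere"),
    ("Verse 3", "אהב", "main"),
]


def count_hits(tokens, lemma, readings):
    """Return (number of results, number of distinct verses) for a lemma,
    looking only at tokens whose reading type is in `readings`."""
    hits = [verse for verse, lem, reading in tokens
            if lem == lemma and reading in readings]
    return len(hits), len(set(hits))


# Searching both readings counts the Ketiv/Qere pair twice.
both = count_hits(TOKENS, "אהב", {"main", "ketiv", "qere"})
# Searching only the Qere counts it once.
qere_only = count_hits(TOKENS, "אהב", {"main", "qere"})

print(both)       # (4, 3)
print(qere_only)  # (3, 3)
```

The number of results changes while the number of verses stays the same, which is the same pattern as in the Bibleworks figures.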

Observations:

  1. I assume that the Westminster Hebrew Morphology is the most accurate. 
  2. All the online sources I have tried that give frequency information are based on Strong's concordance, which is too bad for anyone looking for decent quality, free resources. (I'm disappointed that Biblearc.com fares no better, as its Greek New Testament resources are outstanding.)
  3. There is currently no online resource that provides full parsing information for the Hebrew Bible. 
  4. Tyndale House's Step Bible may be out to change points 2 and 3, but it is not there yet. (A search for אָהַב gave me 196 verses, but no occurrence list.) 
  5. We need to remember that our digital tools still have mistakes.
  6. Please let me know if there are other resources I should be looking at.

Update: Andy from Biblearc emailed (back in January!) to let me know that the discrepancy between Biblearc and other tagged databases of the Hebrew Bible can be explained:

I would challenge the conclusion you make in your blog post that the lemma data we use for the Hebrew searches is not "decent quality". I took a look at the example you give of אהב in your blog post for instance. The verses where the lemma data differs are found in the following verses: Gen 29:20; 1 Kings 10:9, 11:2; 2 Chr 2:10; 2 Chr 9:8; Hos 3:1, 9:15; Mic 6:8.

Taking a look at these, you will discover that the issue is whether to take the appearance of "אהבת + noun" as an infinitive construct of the verb "to love" (אהב) with object or as the noun "love" (אהבה) in construct form. In most all (if not all) of these cases an argument could be made for both of the options as there is no difference in how they are written. Perhaps there is another example besides אהב where the discrepancy is more significant, but if not I would encourage you to reconsider your judgment of the Hebrew lemma data that we use.
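
For anyone who wants to track down this kind of discrepancy themselves, one option is to export the hit list from each tool and compare them verse by verse. The sketch below assumes each tool can give you a plain list of verse references, one entry per occurrence. The function names are my own, and the two hit lists are toy data with invented counts, not actual output from Biblearc or any other database.

```python
from collections import Counter


def summarize(hits):
    """Return (number of results, number of distinct verses) for a hit list."""
    return len(hits), len(set(hits))


def differing_verses(hits_a, hits_b):
    """Return the verses whose occurrence counts differ between two hit lists."""
    count_a, count_b = Counter(hits_a), Counter(hits_b)
    all_verses = count_a.keys() | count_b.keys()
    return sorted(v for v in all_verses if count_a[v] != count_b[v])


# Toy data only: one list treats every ambiguous "אהבת + noun" as the verb,
# the other treats it as the noun and so leaves it out of a verb search.
hits_verb_reading = ["Gen 29:18", "Gen 29:20", "Hos 3:1", "Hos 3:1", "Mic 6:8"]
hits_noun_reading = ["Gen 29:18", "Hos 3:1"]

print(summarize(hits_verb_reading))   # (5, 4)
print(summarize(hits_noun_reading))   # (2, 2)
print(differing_verses(hits_verb_reading, hits_noun_reading))
# ['Gen 29:20', 'Hos 3:1', 'Mic 6:8']
```

A short list of differing verses like this is much easier to evaluate by hand than two totals that simply disagree.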

11 comments:

Unknown said...

Hello Dr. Miller,
I'm a Hebrew lover and doing my intermediate Hebrew course this semester after doing the beginner one last term.

I've been fervently trying to find a free online Hebrew Bible with full parsing, but to no avail. I think you are right that there isn't one currently, which is a real shame as there are abundant resources with parsing for Greek.

Would you please let me know when a free one is available in the future?

d. miller said...

I will, if I remember. Don't hold your breath, however. The Hebrew Bible is a much larger corpus than the NT and nobody seems to want to make their Hebrew parsing work available for free.

Unknown said...

Thanks a lot!
My final resort would be the Logos biblical language pack, but the price is quite dear for a student even with a discount...

d. miller said...

Have you considered Bibleworks? If it is biblical language tools that you want, Bibleworks gives you more Greek and Hebrew resources for less than a comparable Logos package (although Logos is also a very fine program). Accordance is also worth considering.

d. miller said...

Hello again, jfamily 07: If you are just interested in parsing (not searching), I recommend OliveTree's parsed BHS when it comes on sale for 50% off (about $40). Right now it is on sale for $50, which is not too bad: https://www.olivetree.com/store/product.php?productid=17381.

OliveTree works great on most computing devices (including Android phones), but unfortunately it is not possible to perform morphological searches on the Hebrew text, afaik.

Unknown said...

Oh thank you for remembering my request out of all your busyness~~!!! yes, I did see that site already.. much cheaper than logos... yes, i guess i do need to invest a few bucks :-)

in the meanwhile, you would probably know http://www.blueletterbible.org/
it parses the verbs for you up to stem and tense, which is a great help! e.g. Qal perfect

Thanks!
Annie

d. miller said...

I didn't know that, Annie. Thanks. I've looked around on blueletterbible before, but never figured it out.

Unknown said...

I'm a programmer, and I'm interested in writing an app that will provide parsing data for texts, and would also allow people to perform morphological searches. I found https://github.com/morphgnt/, which provides some raw data for the Greek New Testament. Do you know of anything comparable for the Old Testament text? If I can get my hands on some morphological data and a good Hebrew text, I think I can combine it into something that would be useful for people.

d. miller said...

Hi Unknown,
The problem is that, as far as I know, there is no tagged Hebrew Bible database that is freely available online. The Groves Center's is probably the most common, but they charge for it. The main reason for the difference between Greek and Hebrew texts, presumably, is that the Hebrew Bible is much longer and more complicated than the NT.

Unknown said...

I apparently can't set my name when I sign in with my Google Account. I'm Michael Stalker. Are there many instances where a word with vowel points could mean several things? That occasionally happens in Greek, but I'm not sure how often it happens in Hebrew. As a follow-up, would getting a KJV tagged with Strong's numbers, and trying to tag Hebrew words based on that, be a waste of time? I don't know how accurate Strong's is.

d. miller said...

Hi Michael,
To answer your questions: Consonantal homonyms are pretty common in Hebrew, and there are already quite a few tagged Strong's texts available. The main thing that is missing is freely available texts that are tagged for parsing.

Best,

David