Software for creating an index? « Digital Humanities Questions & Answers


Software for creating an index?

(5 posts) (5 voices)

Asked 8 years ago by neuman
Latest answer from Dorothea Salo
This question has a best answer.


neuman
Member
The MS is at the publisher, and now we must create an index. What concordancing or text-analysis software for a PC simplifies the task?
Posted 8 years ago Permalink
parezcoydigo
Member

I recently indexed my book, but still largely by hand. I used DEVONthink to make a concordance as a first round of category identification, and then again after indexing to search for terms I had left out. But I still did the grunt work while reading through the proofs. I wrote about it here.

Posted 8 years ago Permalink
klarlied
Member

As I am sure you know, an index is not a concordance. My experience is that the latter does not even help much in getting to an index. MS Word has a tool that allows you to mark words to be included in an index, and then creates the index, since it knows the page numbers. Even so, you have to read through the manuscript and decide which words need to be included (specifically in context: you may have a section on dogs that should be included, but not every time you use the word). Word does not really know about sections: if your discussion of dogs runs from p. 14 to p. 18, but the word 'dog' does not happen to occur on p. 16, Word will not be able to help.
At this point you have a finished manuscript, but if you decided that you needed to delete a paragraph, it could screw everything up. I had to write a macro to reduce by one all references past a specified page, for this very reason. I could have just re-run the indexer, but then my 'section' tweaking would all have been wasted.
It is a huge pain. That's life.
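
For readers working with a plain-text index rather than Word, the page-shifting step described above could be sketched roughly as follows. This is not the macro from the post: the function name shift_pages is hypothetical, and the sketch assumes each index line has the form "term<TAB>page, page, ..." with plain page numbers and no ranges.

```shell
#!/bin/sh
# shift_pages: hypothetical helper, not the Word macro mentioned in the post.
# Usage: shift_pages CUTOFF < index.txt
# Assumes every line is "term<TAB>n, n, n" (plain page numbers, no ranges).
shift_pages() {
  awk -F '\t' -v cutoff="$1" '
    {
      n = split($2, pages, /, */)
      out = ""
      for (i = 1; i <= n; i++) {
        p = pages[i] + 0
        if (p > cutoff) p--            # references past the cutoff move up one page
        out = out (i > 1 ? ", " : "") p
      }
      printf("%s\t%s\n", $1, out)
    }'
}
```

With the function loaded, "shift_pages 15 < book.index" would rewrite every reference past page 15, leaving hand-tuned entries otherwise untouched (book.index is a placeholder file name).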

Posted 8 years ago Permalink

wallygva
Member

http://www.billposer.org/Linguistics/Computation/LectureNotes/Concordances.html

This site gives you some ideas for building indexes if you have a few Unix shell skills.

Here's a short script that will index a text file by line numbers. I'm sure it could be enhanced but it's probably useful in even this rudimentary form:

 
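#!/bin/sh
# Usage: makeindex.sh FileToIndex
# Writes FileToIndex.index: each word, a tab, then the line numbers where it occurs.
# Keep only letters, blanks, and newlines, so that line numbers survive.
tr -dc 'A-Za-z[:blank:]\n' < "$1" |
awk '
  # Append the current line number to the list kept for each word.
  { for (i = 1; i <= NF; i++) words[$i] = words[$i] sprintf(", %d", NR) }
  END {
    for (w in words) {
      lines = words[w]
      sub(/^, /, "", lines)   # drop the leading separator
      printf("%s\t%s\n", w, lines)
    }
  }' | sort -f -k 1 > "$1.index"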

If you save this as a script called "makeindex.sh", you'd type:
sh makeindex.sh FileToIndex [return]

and you'd get FileToIndex.index when it finished running (typically a second or two).

For example, indexing the first few lines of the previous reply from klarlied you'd get something like this excerpt:

a 1, 2, 4
able 5
about 4
allows 2
am 1
an 1, 2, 2
and 1, 2, 3
As 1
be 2, 3, 4, 5
but 4, 5
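
One possible enhancement, offered only as a sketch: fold case so that "As" and "as" share an entry, list each line number once per word (the excerpt above shows "an 1, 2, 2"), and skip a few very common words. The function name index_words and the stop list are illustrative assumptions, not part of the original script.

```shell
#!/bin/sh
# index_words: hypothetical variant of the script above, not the original.
# Reads text on stdin; writes "word<TAB>line, line, ..." sorted by word.
# The stop list below is an illustrative assumption; extend it as needed.
index_words() {
  tr -dc 'A-Za-z[:blank:]\n' |
  tr 'A-Z' 'a-z' |
  awk -v stop="a an and the of to in is" '
    BEGIN { n = split(stop, s, " "); for (i = 1; i <= n; i++) skip[s[i]] = 1 }
    {
      for (i = 1; i <= NF; i++) {
        w = $i
        # Skip stop words, and words already recorded for this line.
        if (w in skip || seen[w, NR]++) continue
        if (w in words) words[w] = words[w] ", " NR
        else words[w] = NR
      }
    }
    END { for (w in words) printf("%s\t%s\n", w, words[w]) }
  ' | sort -k 1,1
}
```

With the function loaded, "index_words < FileToIndex > FileToIndex.index" would produce the same kind of file as before. It is still a concordance by line number, so it only helps with the first pass of spotting candidate terms.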

A few other links of possible interest:

http://www.textanalysis.info/inforet.htm

http://www.asindexing.org/i4a/pages/index.cfm?pageid=3319

P.S. I agree with the earlier response that suggested DEVONthink as a great tool for creating concordances. Unfortunately for you, it only exists on the Mac platform.

Posted 8 years ago Permalink

Dorothea Salo
Member
Best Answer

I reiterate klarlied above: an index is not a concordance!

True indexing software includes Cindex and SI7. If you need a refresher on why an index is not a concordance, and how to do an index right, I recommend Nancy C. Mulvany's Indexing Books.

Posted 8 years ago Permalink
