Automatically Preparing Edge/Node Data for Gephi « Digital Humanities Questions & Answers


Automatically Preparing Edge/Node Data for Gephi

(7 posts) (3 voices)

Asked 6 years ago by Ryan Cordell
Latest answer from Michael Widner
This question has a best answer.


Ryan Cordell
Member
Okay,

I've done some work with Gephi lately, but I find myself with a problem I can't quite solve. I work on reprinting networks, and thus far have generated network graphs from spreadsheets of reprintings, with the original newspaper in one column (source) and the reprinting newspaper in the second (target). Import that as an edge table and Gephi creates a pretty graph.

I now have a much larger spreadsheet generated from a text-mining experiment I've started with a colleague in computer science. This spreadsheet includes for each found text:

an ID number identifying a particular reprinted text (ex: 8679:5136:18458:8488:5042:872:3924:2547:21444) | Date of each reprinting | URL of source text | Name of each publication | City and State of Publication | Longitude of Publication | the text matched

So there might be 10 lines with the same ID number--the "same text"--but different values in the other columns for each new reprinting of that text we found. I want to generate two opposite but complementary graphs from this data:

1.) in the first, the nodes would be Newspaper titles, and the edges would represent shared reprints--the ID field, I suppose. In other words, edges would be drawn between papers that reprinted the same text. Edges would be larger the more texts the two shared. I suspect there will be a multi-stage process to prepare my data to do this, but I'm honestly not sure where to start.

2.) in the second, the nodes would be individual reprinted texts (the ID field for now, though we're working on generating titles) and the edges would be publications. Edges would be drawn between texts that appeared in the same newspaper.

Any help you can offer would be appreciated. I can't find a way to do this in one step through Gephi, so I'm sure there's some data massaging ahead of me.
Posted 6 years ago Permalink
Scott Weingart
Member
Best Answer

I'm not completely clear what you're looking for by the descriptions, so let me try to re-word it. Do you mean you're looking for how pairs of reprinted texts co-occur based on which publications they share? And then you're looking for how pairs of newspaper titles connect to one another, based on which texts they share? So, two sides of the same network?

If that's the case, the first thing you should do is create a bimodal network. That is, every edge goes from a newspaper title to a reprinted text. You can then follow Shawn's steps here: http://electricarchaeology.ca/2012/04/04/converting-2-mode-with-multimodal-plugin-for-gephi/ to create text-text networks or newspaper-newspaper networks.
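
For anyone who would rather see the idea in code before reaching for the plugin, here is a minimal sketch of the two steps in plain Python: first build the bimodal (newspaper, text) edges, then collapse them onto one node type, counting how many partners each pair shares. The sample rows and the function names are hypothetical, and this is only an illustration of the projection idea, not what the Multimodal plugin does internally.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample rows: (text ID, newspaper title), one per reprinting found.
rows = [
    ("t1", "Paper A"),
    ("t1", "Paper B"),
    ("t1", "Paper C"),
    ("t2", "Paper A"),
    ("t2", "Paper B"),
    ("t2", "Paper B"),  # a duplicate hit; it should not inflate the weight
]

def bimodal_edges(rows):
    """Return the set of unique (newspaper, text_id) edges."""
    return {(paper, text_id) for text_id, paper in rows}

def project(edges, onto="newspaper"):
    """Collapse the bimodal network onto one node type.

    Returns {(a, b): weight}, where weight is the number of nodes of the
    other type that a and b have in common.
    """
    groups = defaultdict(set)
    for paper, text_id in edges:
        if onto == "newspaper":
            groups[text_id].add(paper)   # papers grouped by shared text
        else:
            groups[paper].add(text_id)   # texts grouped by shared paper
    weights = defaultdict(int)
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            weights[(a, b)] += 1
    return dict(weights)

edges = bimodal_edges(rows)
paper_graph = project(edges, onto="newspaper")  # graph 1 in the question
text_graph = project(edges, onto="text")        # graph 2 in the question
```

With the sample rows, Paper A and Paper B end up with weight 2 because they share both texts, which is the "larger edge for more shared texts" behaviour asked for above.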

Posted 6 years ago Permalink
Ryan Cordell
Member

Right, Scott, two sides of the same network: one with texts themselves as nodes and the other with newspaper titles as nodes. What I'm asking for help on is your recommendation: "create a bimodal network." Should I import my spreadsheet into Gephi as an edge graph, with the IDs as source and the Newspaper titles as target, and then use the plugin Shawn references to convert that graph to a 1-mode network?

Posted 6 years ago Permalink
Scott Weingart
Member
Best Answer

So, this is a slightly more complicated problem than it ought to be. Instead of importing a network as an edge list, you have to import your data as separate node and edge lists, as described here: https://gephi.org/users/supported-graph-formats/spreadsheet/

In the node list, you'll need to add a node attribute (an extra column) that labels the 'type' of each node: whether it is a newspaper title or an ID. Once that network is loaded, you should be able to follow Shawn's steps.
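
As a rough sketch of what those two lists might look like, the snippet below builds a node table and an edge table from a handful of bimodal edges. The sample edges, the function names, and the attribute column name `node_type` are my own assumptions; the `Id`/`Label` and `Source`/`Target`/`Type` headers follow the column names Gephi's spreadsheet import looks for, but check the page linked above against your Gephi version.

```python
import csv
import io

# Hypothetical bimodal edges: (newspaper title, text ID).
edges = [("Paper A", "t1"), ("Paper B", "t1"), ("Paper A", "t2")]

def node_table(edges):
    """Node rows with an extra column marking each node's type.

    Assumes no newspaper title is identical to a text ID.
    """
    rows = {}
    for paper, text_id in edges:
        rows[paper] = (paper, paper, "newspaper")
        rows[text_id] = (text_id, text_id, "text")
    return [("Id", "Label", "node_type")] + sorted(rows.values())

def edge_table(edges):
    """Edge rows, one per unique newspaper-text pair."""
    body = [(paper, text_id, "Undirected") for paper, text_id in sorted(set(edges))]
    return [("Source", "Target", "Type")] + body

def to_csv(table):
    """Serialize a list of row tuples to CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(table)
    return buf.getvalue()

nodes_csv = to_csv(node_table(edges))
edges_csv = to_csv(edge_table(edges))
```

Writing `nodes_csv` and `edges_csv` out to two files (for example `nodes.csv` and `edges.csv`, names of your choosing) gives you the pair of spreadsheets to import as a node table and an edge table.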

Posted 6 years ago Permalink
Ryan Cordell
Member

Thanks so much for your help on this, Scott. The plugin method seems to have worked, though I still need to clean up the resulting graph:

https://dl.dropbox.com/u/492930/Gephi%200.8.1%20beta%20-%20ChronAm-3000-2.gephi.png

Posted 6 years ago Permalink
Scott Weingart
Member

Great to hear it, looking forward to the results.

Posted 6 years ago Permalink
Michael Widner
Member

A quick tip for anyone doing similar work. The Python library NetworkX (http://networkx.github.com/) makes it very easy to create graph files that Gephi (and other programs) can read. For example, it will output a GEXF for you from nodes and edges that you create programmatically. You can set the colors, size, and other attributes so that you can have your data formatted for display in Gephi without all the manual work that sometimes requires.
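
A small sketch of that workflow, assuming NetworkX is installed: it builds the bimodal newspaper/text graph discussed earlier in the thread, projects it onto newspapers with NetworkX's bipartite helpers, and leaves the GEXF export as a commented-out last step. The sample rows, the `node_type` attribute name, and the output filename are hypothetical.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical sample rows: (text ID, newspaper title).
rows = [
    ("t1", "Paper A"),
    ("t1", "Paper B"),
    ("t1", "Paper C"),
    ("t2", "Paper A"),
    ("t2", "Paper B"),
]

def build_bimodal(rows):
    """Build an undirected graph linking each newspaper to the texts it printed."""
    B = nx.Graph()
    for text_id, paper in rows:
        B.add_node(paper, node_type="newspaper")
        B.add_node(text_id, node_type="text")
        B.add_edge(paper, text_id)
    return B

def newspaper_projection(B):
    """Project onto newspapers; edge 'weight' counts the texts two papers share."""
    papers = [n for n, d in B.nodes(data=True) if d["node_type"] == "newspaper"]
    return bipartite.weighted_projected_graph(B, papers)

B = build_bimodal(rows)
P = newspaper_projection(B)

# Uncomment to write a file Gephi can open (filename is an example):
# nx.write_gexf(P, "newspapers.gexf")
```

Swapping the node list passed to `weighted_projected_graph` for the text nodes gives the text-to-text graph instead.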

Posted 5 years ago Permalink
