BibSON is written in RSON (see https://code.google.com/p/rson/wiki/Manual for the manual and https://pypi.python.org/pypi/rsonlite/0.1.0 for the rsonlite parser used below), with special keywords for bibliographic data.
RSON syntax is relaxed compared to JSON:
Any valid JSON file is also a valid RSON file as long as it is encoded in UTF-8 or ASCII. (External conversion functions could detect other files and pass UTF-8 or Unicode to the RSON decoder.)
Comments are allowed with a leading # (in any column) if the comment is the only non-whitespace on the line. (But inside a triple-quoted or equal-delimited string, the # may not always start a comment; it may be part of the string.)
String quoting is not required unless the string contains any RSON special characters. RSON special characters are: { } [ ] : , " = (the same as the JSON special characters, plus =).
Python-style triple-quoted (""") strings are supported for practically arbitrary embedded data.
Integer formats include hex, binary, and octal, and embedded underscores are allowed.
In addition to the relaxed JSON syntax, RSON supports Python-style indentation to create structures. Inside any [] or {} pair, the syntax is JSON syntax, with the enhancements described above. Outside any [] or {} pair, RSON indented syntax is used to describe the structure.
RSON indentation controls the nesting levels of dicts or lists. As with Python, spaces or tabs may be used (in fact, any valid JSON whitespace except \n or \r may be used for any whitespace, including indentation), but RSON does not make any equivalence between tabs and spaces, so indenting a line the same as or more than a previous line requires prefixing the line with exactly the same whitespace characters as were used on the previous line. Mixing tab and space (or other) indentation whitespace will most likely result in an error.
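The "exactly the same whitespace characters" rule can be illustrated with a small standalone check. This is only a sketch of the rule as described above, not the RSON parser's own logic: the function name `check_indentation` is hypothetical, and it looks at leading whitespace only, ignoring triple-quoted strings and [] / {} sections.

```python
def check_indentation(text):
    """Return the 1-based numbers of lines whose indentation is inconsistent.

    Indentation is compared as raw strings, so a tab never counts as
    equivalent to any number of spaces.
    """
    bad_lines = []
    open_indents = [""]  # whitespace prefixes of the enclosing nesting levels
    for lineno, line in enumerate(text.splitlines(), start=1):
        body = line.lstrip(" \t")
        if not body or body.startswith("#"):
            continue  # blank and comment-only lines do not affect nesting
        indent = line[: len(line) - len(body)]
        current = open_indents[-1]
        if indent == current:
            continue  # same level as the previous structural line
        if len(indent) > len(current) and indent.startswith(current):
            open_indents.append(indent)  # deeper level, extends the same prefix
        elif indent in open_indents:
            while open_indents[-1] != indent:
                open_indents.pop()  # dedent back to a level that is still open
        else:
            bad_lines.append(lineno)  # mixed or unmatched whitespace
    return bad_lines


good = "AuthorList\n    First = Etienne\n    Last = Ackermann\n"
mixed = "AuthorList\n    First = Etienne\n\tLast = Ackermann\n"
print(check_indentation(good))   # []
print(check_indentation(mixed))  # [3]
```

In the second string the third line is indented with a tab where the previous line used four spaces, so it is reported even though many editors would display the two lines at a similar depth.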
BibSON makes it easy to ...
@_@ I just found a very similar (much more advanced) project! See http://scholar.berkeley.edu/pitman/software/bibjson and http://okfnlabs.org/bibjson/ ...
Although I have to say, BibSON is still much easier to read and type :D... I should consider writing parsers from RIS, BibTeX, etc. to BibSON, and formalizing my specification.
And some more: http://blog.martinfenner.org/2013/07/30/citeproc-yaml-for-bibliographies/ http://blog.martinfenner.org/2013/08/04/automatically-list-all-your-publications-in-your-blog/ https://github.com/inukshuk/jekyll-scholar
In page numbers, xx--xx is automatically converted to xx–xx, and –, —, and & are converted to the corresponding HTML entities.
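As a rough illustration of that conversion, the sketch below turns ASCII and literal dashes and ampersands into HTML entities. The function name `to_html_entities` is hypothetical, and the sketch assumes the input contains no entities yet (an existing `&ndash;` would have its ampersand escaped again).

```python
def to_html_entities(text):
    """Convert dashes and ampersands in plain text to HTML entities."""
    # Normalise literal en and em dashes to their ASCII spellings first,
    # so that both input styles take the same path below.
    text = text.replace("\u2013", "--").replace("\u2014", "---")
    # Escape ampersands before any entity is introduced, otherwise the
    # ampersands of the new entities would be escaped too.
    text = text.replace("&", "&amp;")
    # Replace the longer dash first so "---" is not consumed as "--" plus "-".
    text = text.replace("---", "&mdash;")
    text = text.replace("--", "&ndash;")
    return text


print(to_html_entities("pp. 12--34"))
print(to_html_entities("Sleep & replay --- a review"))
```

The order of the replacements matters: ampersands are handled before the dash entities are written, and the three-hyphen form before the two-hyphen form.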
Title = Latent variable modeling of hippocampal replay
#Type = [poster, cpaper, jpaper, talk, chapter, review, thesis, preprint, other]
Type = cpaper
AuthorList
First = Etienne
Last = Ackermann
ORCID = orcid.org/0000-0001-7139-9360
Bold = TRUE
First = [First, Middle]
Last = Last
# Date = YYYYMMDD; use YYYY0000 for just the year;
Date = YYYYMMDD
ConfDates = October 17--21, 2015
Pages =
DOI =
URL =
ExternLink = docs/posters/utaustin2015.pdf
PosterImg = images/poster-thumbs/SfN15.png
Abstract =
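The Date convention used in the entry above (YYYYMMDD, with YYYY0000 when only the year is known) can be unpacked as in the sketch below. The names `MONTH_NAMES` and `split_date` are hypothetical, and treating 00 as "unknown" for the day as well as the month is an assumption made here for illustration.

```python
MONTH_NAMES = ["", "January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]


def split_date(date):
    """Split a YYYYMMDD string into (year, month_name, day).

    A month of 00 gives an empty month name and a day of 00 gives 0,
    which covers the YYYY0000 form for entries that only have a year.
    """
    if len(date) != 8 or not date.isdigit():
        raise ValueError("expected YYYYMMDD, got %r" % (date,))
    year = date[:4]
    month = int(date[4:6])
    day = int(date[6:8])
    if month > 12 or day > 31:
        raise ValueError("month or day out of range in %r" % (date,))
    return year, MONTH_NAMES[month], day


print(split_date("20151017"))  # ('2015', 'October', 17)
print(split_date("20150000"))  # ('2015', '', 0)
```

The unfilled placeholder `YYYYMMDD` itself is rejected with a `ValueError`, which makes it easy to warn about entries whose date was never filled in.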
TODO:
write specification
expand citeString for conference, preprint, etc., as well as additional, custom labels in a list...
write syntax highlighting for Sublime Text: http://sublimetext.info/docs/en/extensibility/syntaxdefs.html
In [46]:
import rsonlite as rsl
import io
import contextlib
import sys  # needed for the stderr redirect used when reporting warnings
mybibfile = 'kemerelab.bibson'
#mybibfile = 'pubs.bibson'
with open(mybibfile, 'r') as f:
    bibson = f.read()
mybib = rsl.simpleparse(bibson)
print('{0} entries read from "{1}"'.format(len(mybib), mybibfile))
In [48]:
pubTypes = {'jpaper': 'journal paper', 'poster': 'poster', 'other': 'other', 'cpaper': 'conference paper', 'talk': 'talk', 'preprint': 'preprint', 'thesis': 'thesis', 'chapter': 'book chapter', 'review': 'review' }
pubLabels = {'jpaper': 'journal', 'poster': 'poster', 'other': 'other', 'cpaper': 'conference', 'talk': 'talk', 'preprint': 'preprint', 'thesis': 'thesis', 'chapter': 'chapter', 'review': 'review' }
months = {'MM': '', '00': '', '01': 'January', '02': 'February', '03': 'March', '04': 'April', '05': 'May', '06': 'June', '07': 'July', '08': 'August', '09': 'September', '10': 'October', '11': 'November', '12': 'December'}
htmlwriter = io.StringIO()
for ii, pubitem in enumerate(mybib):
    warninglist = []
    pubTitle = pubitem.get('Title', '')
    if pubTitle == '':
        warninglist.append('Title is empty in pub: ' + str(ii+1))
    AuthorList = pubitem.get('AuthorList', [])
    pubType = pubitem.get('Type', 'other')
    pubDate = pubitem.get('Date', 'YYYYMMDD')
    pubYear = pubDate[0:4]
    pubMonth = pubDate[4:6]
    if pubDate == 'YYYYMMDD' or len(pubDate) != 8:
        warninglist.append('Date not specified, or incorrect format! Use YYYYMMDD in pub: ' + str(ii+1))
    num_authors = len(AuthorList)
    pubExternLink = pubitem.get('ExternLink', '#')
    pubURL = pubitem.get('URL', '#')
    pubAbstract = pubitem.get('Abstract', '')
    # normalise literal dashes to ASCII, escape &, then emit HTML entities
    pubAbstract = pubAbstract.replace("–", "--")
    pubAbstract = pubAbstract.replace("—", "---")
    pubAbstract = pubAbstract.replace("&", "&amp;")
    pubAbstract = pubAbstract.replace("---", "&mdash;")
    pubAbstract = pubAbstract.replace("--", "&ndash;")
    pubPosterImg = pubitem.get('PosterImg')
    if num_authors == 0:
        warninglist.append('AuthorList is empty in pub: ' + str(ii+1))
    #print('number of authors: {0}'.format(num_authors))
    authorliststring = ''
    if not isinstance(AuthorList, list): # only one author, given as a single dict
        #print('Only one author!')
        authorstring = ''
        if isinstance(AuthorList['First'], list): # list of names given
            #print('with multiple names!')
            for name in AuthorList['First']:
                authorstring = authorstring + name.capitalize()[0] + '. '
        else: # only one name given
            authorstring = authorstring + AuthorList['First'].capitalize()[0] + '. '
        authorliststring = authorstring + AuthorList['Last']
        #print(authorliststring)
    else:
        # jj (not ii) so the publication index of the outer loop is not overwritten
        for jj, author in enumerate(AuthorList):
            authorstring = ''
            if isinstance(author['First'], list): # list of names given
                for name in author['First']:
                    authorstring = authorstring + name.capitalize()[0] + '. '
            else: # only one name given
                authorstring = authorstring + author['First'].capitalize()[0] + '. '
            authorstring = authorstring + author['Last']
            if jj == 0: # first author; checked first so a one-author list gets no ' and '
                authorliststring = authorstring
            elif jj == len(AuthorList)-1:
                authorliststring = authorliststring + ' and ' + authorstring
            else:
                authorliststring = authorliststring + ', ' + authorstring
        #print(authorliststring)
    pubConfName = pubitem.get('ConfName')
    pubConfDates = pubitem.get('ConfDates')
    pubGeneric = pubitem.get('Generic')
    pubJournal = pubitem.get('Journal')
    pubVolume = pubitem.get('Volume')
    pubIssue = pubitem.get('Issue')
    pubNumber = pubitem.get('Number')
    pubPages = pubitem.get('Pages')
    pubDOI = pubitem.get('DOI')
    citeString = ''
    if pubGeneric:
        citeString += pubGeneric
    if pubConfName:
        citeString += '<i>' + pubConfName + '</i>'
    if pubJournal:
        citeString += '<i>' + pubJournal + '</i>'
    if pubVolume:
        citeString += ', vol. ' + pubVolume
    if pubNumber:
        citeString += ', no. ' + pubNumber
    if pubIssue:
        citeString += ', issue ' + pubIssue
    if pubPages:
        pubPages = pubPages.replace("--", "&ndash;")
        citeString += ', pp. ' + pubPages
    if pubJournal:
        # .get avoids a KeyError when the date is malformed
        citeString += ', ' + months.get(pubMonth, '') + ' ' + pubYear + '.'
    elif pubConfDates:
        citeString += ', ' + pubConfDates
    # write html snippet:
    with contextlib.redirect_stdout(htmlwriter):
        htmlsnippet = ('<div class="item mix ' + pubType + '" data-year="' + pubDate + '"><div class="pubmain"><div class="pubassets"> <a href="' + pubExternLink + '" class="tooltips" title="Download" target="_blank"></a><i class="icon-cloud-download"></i><a href="' + pubURL + '" class="tooltips" title="Link" target="_blank"></a></div><h4 class="pubtitle">' + pubTitle + '</h4><div class="pubauthor">' + authorliststring + '</div><div class="pubcite"><span class="label label-' + pubLabels[pubType] + '">' + pubTypes[pubType] + '</span><span class="label label-year">' + pubYear + '</span>' + citeString + '</div></div><div class="pubdetails">')
        print(htmlsnippet, end='')
        if pubPosterImg:
            htmlsnippet = '<a href="' + pubExternLink + '" target="_blank"><img alt="image" src="' + pubPosterImg + '" align="left" style="padding:0 15px 15px 0; width: 180px;"></a>'
            print(htmlsnippet, end='')
        htmlsnippet = '<h4>Abstract</h4><p>' + pubAbstract + '</p></div></div>'
        print(htmlsnippet, end='')
    print('Finished parsing ' + pubTypes[pubType] + ' "' + pubTitle[0:40] + '..." ' + '(' + pubYear + ')')
    if warninglist:
        with contextlib.redirect_stdout(sys.stderr):
            for warning in warninglist:
                print(warning)
# wrap the generated entries in the static header and footer of the page
with open("pubs-pre.html", 'r') as f:
    htmlpre = f.read()
with open("pubs-post.html", 'r') as f:
    htmlpost = f.read()
with open("pubs.html", 'w') as f:
    f.write(htmlpre)
    f.write(htmlwriter.getvalue())
    f.write(htmlpost)
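The author handling in the loop above covers the single-author and multi-author cases in separate branches that do much the same work. One possible way to fold them together is sketched below. The helper names `initials` and `format_author_list` are hypothetical, and the sketch assumes each author is a dict with 'First' and 'Last' keys, where 'First' is either a string or a list of strings, as in the loop.

```python
def initials(first):
    """Turn 'Etienne' or ['First', 'Middle'] into 'E. ' or 'F. M. '."""
    names = first if isinstance(first, list) else [first]
    return "".join(name.strip()[0].upper() + ". " for name in names if name.strip())


def format_author_list(authors):
    """Format one author dict or a list of author dicts as a display string."""
    if isinstance(authors, dict):  # a single author given without a list
        authors = [authors]
    names = [initials(a.get("First", "")) + a["Last"] for a in authors]
    if len(names) <= 1:
        return "".join(names)  # zero or one author: no separator needed
    return ", ".join(names[:-1]) + " and " + names[-1]


print(format_author_list({"First": "Etienne", "Last": "Ackermann"}))
print(format_author_list([
    {"First": "Etienne", "Last": "Ackermann"},
    {"First": ["First", "Middle"], "Last": "Last"},
]))
```

With helpers like these, the loop body would reduce to a single call such as `authorliststring = format_author_list(AuthorList)`, and a list containing a single author yields just that name.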