Research discovery

Has someone answered the question you have not worked out how to ask yet?

Recommender systems for academics are hard; I suspect they are harder than usual because, by definition, the content worth recommending is new to the reader.

Could a normal recommender system such as canopy be made to work for academics?

Unpaywall and oaDOI seem to be indices of non-paywalled preprints of paywalled articles: oaDOI is a website, Unpaywall a browser extension.
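The index is also queryable over HTTP; a minimal sketch, assuming the current Unpaywall REST endpoint (`api.unpaywall.org/v2/{doi}`, which requires an `email` parameter per their usage policy), of looking up an open-access copy of a DOI:

```python
from urllib.parse import quote

UNPAYWALL_API = "https://api.unpaywall.org/v2/"

def unpaywall_url(doi: str, email: str) -> str:
    """Build the Unpaywall v2 lookup URL for a DOI.

    The response is JSON; its `best_oa_location` field, if present,
    points at a legal open-access copy of the article.
    """
    return f"{UNPAYWALL_API}{quote(doi)}?email={quote(email)}"

# Fetching is then a plain GET, e.g.:
#   import json, urllib.request
#   record = json.load(urllib.request.urlopen(unpaywall_url(doi, email)))
#   pdf = (record.get("best_oa_location") or {}).get("url_for_pdf")

lookup = unpaywall_url("10.1038/nature12373", "you@example.org")
```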

  • arxiv sanity

    Aims to prioritise the arxiv paper-publishing firehose so that you can discover papers of relevance to your own interests, at least if those interests are in machine learning.

    Arxiv Sanity Preserver

    Built by @karpathy to accelerate research. Serving last 26179 papers from cs.[CV|CL|LG|AI|NE]/stat.ML

    Includes twitter-hype sorting, TF-IDF clustering, and other such basic but important baby steps towards web2.0 style information consumption.

    The servers are overloaded of late, possibly because of the unfavourable scaling of all the SVMs it uses, or the continued growth of arXiv, or epidemic addiction to intermittent variable rewards amongst machine learning researchers. You could run your own installation – it is open source – but the download and processing requirements are prohibitive. arXiv is big and fast.

  • groundai: aims to affect both discovery and publishing by providing community peer review.

    Community Peer Review commenting will provide a way for researchers to ask for feedback about their work, incorporate that feedback into revisions, and generate new ideas.

    Making this feedback openly accessible to everyone can help increase the public’s understanding and trust of scientific work and increase transparency.

    Community support and a dialogue atmosphere inspire ideas to flow and be explored freely through insightful questions. In dialogue, people think together.

    Preprint discussions usually happen on Twitter and Facebook, but those comments are not housed with the preprint. We believe that the opportunity to provide feedback stored directly with the preprint will increase transparency and collaboration at all stages of the scientific process. We hope to see dialogue become part of the scholarly record.

    tl;dr: this renders papers in a friendly format for public annotation, and links easily to related papers, supporting data and so on.

  • Papers with code

    The mission of Papers With Code is to create a free and open resource with Machine Learning papers, code and evaluation tables.

    We believe this is best done together with the community and powered by automation.

    We’ve already automated the linking of code to papers, and we are now working on automating the extraction of evaluation metrics from papers.

  • gitxiv (source)

    In recent years, a highly interesting pattern has emerged: Computer scientists release new research findings on arXiv and just days later, developers release an open-source implementation on GitHub. This pattern is immensely powerful.

    GitXiv is a space to share links to open computer science projects. Countless Github and arXiv links are floating around the web. It’s hard to keep track of these gems. GitXiv attempts to solve this problem by offering a collaboratively curated feed of projects. Each project is conveniently presented as arXiv + Github + Links + Discussion. Members can submit their findings and let the community rank and discuss it. A regular newsletter makes it easy to stay up-to-date on recent advancements. It’s free and open.

    In terms of things that I will actually use, this source-code requirement idea is good. However, the site itself is no longer maintained and has fallen into disrepair.

    Perhaps it is superseded by Papers with Code, above, which is similar.

  • trendingarxiv (source):

    Keep track of arXiv papers and the tweet mini-commentaries that your friends are discussing on Twitter.

    Because somehow some researchers have time for twitter and the opinions of such multitasking prodigies are probably worthy of note. That is sadly beyond my own modest capacities. Anyway, great hack, good luck.
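The TF-IDF sorting that arxiv-sanity relies on is simple enough to reproduce; here is a self-contained toy sketch (with a hypothetical mini-corpus, not Karpathy's actual code) of ranking papers against a reference abstract by cosine similarity of TF-IDF vectors:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each document (a list of tokens) to a {term: tf-idf} dict."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))  # document frequency
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: (c / len(doc)) * idf[t] for t, c in Counter(doc).items()}
            for doc in docs]

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = lambda w: math.sqrt(sum(x * x for x in w.values())) or 1.0
    return dot / (norm(u) * norm(v))

# Hypothetical toy corpus of paper abstracts.
abstracts = [
    "variational inference for deep generative models".split(),
    "deep convolutional networks for image classification".split(),
    "bayesian variational methods for generative inference".split(),
]
vecs = tfidf_vectors(abstracts)
# Rank the other papers against the first one, most similar first.
scores = sorted(((cosine(vecs[0], v), i) for i, v in enumerate(vecs[1:], 1)),
                reverse=True)
```

The third abstract shares the rarer terms ("variational", "generative", "inference") with the first, so it outranks the second; terms appearing in every document get IDF zero and drop out, which is the whole trick.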

Paper analysis/annotation

Baldur Bjarnason, Neither Paper Nor Digital Does Active Reading Well:

Catching up on usability research throughout the years makes you want to smash your laptop against the wall in anger. And trying to fill out forms online makes you scream ‘it doesn’t have to be this way!’ at the top of your lungs. The same applies to reading software. When you read up on research and papers on skills development, memory formation, and active reading, frustration with existing tools inevitably follows.

At least with paper, we can teach people to hack their tools—extend the printed book with post-its, commonplace books, bookmarks, and inline annotation. Doing the same in digital is incredibly hard without programming skills or expensive tools, even when the closed silos allow it.

The cognitive effort to actively and intelligently read a text in depth is, if not equal to, then on the same order of magnitude as the effort to write about a complex subject.

But we only have full-featured tools to help us with writing. Ulysses, Tinderbox, Scrivener etc. all make managing a complex writing task much easier. Even code-oriented text-editing workflows, with their steeper learning curves, are a major improvement over paper-based writing workflows.

We can also use paper-based writing tactics in tandem with the digital ones, to the point of going back and forth between the two. You can’t do the same easily with reading.

Which leads us to the current situation: our ability to handle complex writing tasks is increasing while our default reading toolset is stagnating at best.

He ends up giving an extensive advertisement for LiquidText, an iPad app with actual UI development, which looks nice. If you have an iPad.

Hypothesis, a browser-based annotation tool, covers similar ground on the open web:

Select text to annotate. Add tags and post publicly or save privately.

Reply to or share any annotation. Link to notes or whole pages.

Annotate together in groups. Collaborate privately with others.

Search your notes. Explore all public annotations and profiles.

They have documented a recommended workflow.

pdfx (source) claims to:

  • Extract references and metadata from a given PDF
  • Detect pdf, url, arxiv and doi references
  • Fast, parallel download of all referenced PDFs
  • Output as text or JSON (using the -j flag)
  • Extract the PDF text (using the --text flag)
  • Use as command-line tool or Python package
  • Works with local and online pdfs
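The detection step is essentially pattern matching; a toy regex sketch (an illustration of the kind of matching pdfx performs, not its actual implementation) of pulling DOI and arXiv identifiers out of extracted text:

```python
import re

# Loose patterns for two of the identifier styles pdfx detects; real-world
# matching needs more care (trailing punctuation, pre-2007 arXiv ids, etc.).
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")
ARXIV_RE = re.compile(r"\barXiv:(\d{4}\.\d{4,5}(?:v\d+)?)", re.IGNORECASE)

def find_references(text):
    """Return the DOI and arXiv identifiers found in `text`."""
    return {
        "doi": DOI_RE.findall(text),
        "arxiv": [m.group(1) for m in ARXIV_RE.finditer(text)],
    }

sample = ("See doi:10.1038/nature12373 and the preprint arXiv:1406.2661v1 "
          "for details.")
refs = find_references(sample)
```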

Fermat’s Librarian is

A Chrome extension that enhances arXiv papers. Get direct links to references, BibTeX extraction and comments on all arXiv papers.