What I built:

Gatherscope was everyone’s Presidential Daily Brief, organizing and summarizing the world’s news so that busy readers could make sense of current events quickly. It was written mainly in Python and consisted of:

  • Milli – content crawling service
  • Sortinghat – content classification and clustering service
  • Distillery – summarization service

Ultimately, there was an unsustainable gap between the cost of productionizing state-of-the-art natural language processing in a news context and consumers’ willingness to pay for it.

What I learned:

  • Lots about neural networks:
    • recurrent neural networks
    • generative adversarial networks
    • reinforcement learning
  • Lots about natural language processing:
    • language models
    • document clustering
    • machine translation
    • machine summarization