Building A Python-Based Search Engine
- Type: Talk
- Audience level: Experienced
- Category: Other
March 11th 1:30 p.m. – 2:10 p.m.
Description
Search is an increasingly common request in all types of applications as the amount of data we all deal with continues to grow. The technology and architecture behind search engines are wildly different from what many developers expect. This talk gives a solid grounding in the fundamentals of providing search, using Python to flesh out these concepts in a simple library.
Abstract
- Core concepts
- Terminology
- Document-based
- Show basic starting code for a document
- Inverted Index
- Show a simple inverted index class
- Stemming
- N-gram
- Show a tokenizer/n-gram processor
- Fields
- Show a document handler which ties it all together
- Searching
- Show a simple searcher (& the whole thing working together)
- Faceting (likely no demo)
- Boost (likely no demo)
- More Like This
- Wrap up
- Point to the GitHub repo for the sample code
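
To make the outline above concrete, here is a minimal sketch of how a tokenizer, an n-gram processor, an inverted index, and a searcher might fit together. It is not the talk's sample code: the names `tokenize`, `edge_ngrams`, and `InvertedIndex` are hypothetical, documents are assumed to be plain dicts of field name to text, the n-grams are front-anchored ("edge") n-grams with an assumed minimum size of 3, and stemming, faceting, boost, and More Like This are left out.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def edge_ngrams(token, min_size=3):
    """Return front-anchored n-grams so that word prefixes can match.

    Tokens no longer than min_size are returned whole.
    (min_size=3 is an illustrative assumption, not a recommendation.)
    """
    if len(token) <= min_size:
        return [token]
    return [token[:size] for size in range(min_size, len(token) + 1)]


class InvertedIndex:
    """Hypothetical in-memory index: maps each term to the ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.documents = {}               # document id -> dict of fields

    def add(self, doc_id, fields):
        """Store a document (a dict of field name -> text) and index every field."""
        self.documents[doc_id] = fields
        for value in fields.values():
            for token in tokenize(value):
                for gram in edge_ngrams(token):
                    self.postings[gram].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents matching every query token (AND semantics)."""
        tokens = tokenize(query)
        if not tokens:
            return []
        results = None
        for token in tokens:
            matches = self.postings.get(token, set())
            results = matches if results is None else results & matches
        return sorted(results)


if __name__ == "__main__":
    index = InvertedIndex()
    index.add(1, {"title": "Building a search engine",
                  "body": "Inverted indexes map terms to documents"})
    index.add(2, {"title": "Python searching tips",
                  "body": "Tokenizers split text into terms"})
    print(index.search("sear"))           # prefix match through edge n-grams
    print(index.search("search engine"))  # both tokens must match
```

The key design point is that lookups go from term to documents rather than scanning each document for the query, which is the part that tends to surprise developers used to database-style queries. A fuller version would keep per-field postings and term positions so that results can be scored and ranked.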