Inverted Index

Question (LI.500)

Given a list of documents, each with an id and content, return a HashMap that maps every word to the list of ids of the documents containing it (an inverted index).

Example

Input: 
[
  {
    "id": 1,
    "content": "This is the content of document 1 it is very short"
  },
  {
    "id": 2,
    "content": "This is the content of document 2 it is very long bilabial bilabial heheh hahaha ..."
  }
]

Output: 
{
   "This": [1, 2],
   "is": [1, 2],
   ...
}

Code
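
A minimal sketch in Python, where a dict plays the role of the HashMap. It assumes that words are separated by whitespace, that matching is case-sensitive (the example output keeps "This" capitalized), that each document id is unique, and that an id should appear only once per word even when the word repeats within a document. The function name inverted_index is chosen for illustration and is not part of the original question.

```python
from collections import defaultdict


def inverted_index(docs):
    """Map each word to the ids of the documents that contain it."""
    index = defaultdict(list)
    for doc in docs:
        # Assumption: words are whitespace-separated and case-sensitive.
        for word in doc["content"].split():
            ids = index[word]
            # Documents are processed one at a time, so if this id was
            # already recorded for the word it is the last entry in the list.
            if not ids or ids[-1] != doc["id"]:
                ids.append(doc["id"])
    return dict(index)


docs = [
    {"id": 1, "content": "This is the content of document 1 it is very short"},
    {"id": 2, "content": "This is the content of document 2 it is very long bilabial bilabial heheh hahaha ..."},
]
index = inverted_index(docs)
print(index["This"])      # [1, 2]
print(index["bilabial"])  # [2]
```

With n total words across all documents, this runs in O(n) time. Punctuation handling and lowercasing are left out on purpose, since the question does not specify them.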

Reference

  • Inverted Index

  • TF-IDF

  • SVD
