NLP over large text datasets means handling gigabytes or terabytes of text. Tasks such as import, storage, cleaning, and transformation must all be managed reliably, yet in a way that does not impede text analysis. Most importantly, all processing needs to be efficient and scalable.
A relational database provides an efficient and robust environment for large-scale data management, while SQL provides flexibility and convenience for data analysis and organization.
Yet, until now, it has been difficult to process unstructured text that resides in the database, leading many to perform an expensive and cumbersome export-import cycle in order to analyze the text outside the database.
We have developed technology and tools for efficient and scalable in-database NLP. Our technology brings the flexibility and efficiency of query-driven processing to NLP:
- It is efficient and completely avoids the expensive export-import cycle
- It allows a full NLP pipeline to be embedded in the database
- It scales to gigabytes and terabytes of data
- It allows flexible, declarative NLP pipeline construction
- It allows NLP tasks to take advantage of query optimizations and indexing
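To make the idea of query-driven text processing concrete, here is a minimal sketch of the general pattern, not our product's API. It assumes SQLite (via Python's standard `sqlite3` module) as a stand-in database, and the `docs` table, the `token_count` SQL function, and the helper names are all hypothetical. A simple text-processing step is registered as a SQL function, so filtering happens in a query and the text never leaves the database.

```python
import re
import sqlite3


def count_tokens(text):
    """Hypothetical NLP step: count word-like tokens in a document."""
    return len(re.findall(r"\w+", text or ""))


def build_demo_db():
    """Create an in-memory SQLite database with a small, made-up docs table."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL under the (assumed) name token_count.
    conn.create_function("token_count", 1, count_tokens)
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO docs (body) VALUES (?)",
        [
            ("The quick brown fox",),
            ("Hello world",),
            ("In-database NLP avoids export",),
        ],
    )
    return conn


def long_documents(conn, min_tokens):
    """Select documents with at least min_tokens tokens, entirely in SQL."""
    rows = conn.execute(
        "SELECT id, token_count(body) AS n "
        "FROM docs WHERE token_count(body) >= ? ORDER BY id",
        (min_tokens,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(long_documents(conn, 4))
```

The text step is composed declaratively with ordinary SQL clauses such as `WHERE` and `ORDER BY`, which is the property the list above describes. A production system would replace the toy token counter with real NLP operators and rely on the database's own optimizer and indexes.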
Contact us to find out how our technology can help with your text analytics.