A small, fast, efficient computational linguistics model forms the foundation of Enginius capabilities. The Enginius engine is optimized to accurately model the semantics and context in technical micro-domains.
The Enginius engine starts with one of several cutting-edge natural language models. By focusing on semantic representation, the Enginius engine is able to achieve world-record performance on several benchmarks (CoLA, QQP, MNLI). A key differentiator for the Enginius engine is its model size (less than 1% of the size of other models such as GPT-3). Another differentiator is speed, a critical requirement for processing large technical repositories responsively. The Enginius semantic engine was initially developed under USAF, DARPA, and MDA innovation contracts.
Most technical organizations develop their own jargon and abbreviations to communicate technical information efficiently. This jargon is dynamic: it changes over time and differs by discipline and location.
Enginius automatically scans document repositories to build and maintain the technical dictionary. The Enginius dictionary leverages a custom syntax structure to organize terminology. Manual supervision is integrated to ensure quality.
The Enginius dictionary is designed to support even the most complex, large-scale organizations. It can handle multiple expansions for the same abbreviation and different meanings for the same term. By improving jargon understanding, the Enginius dictionary enhances communication across engineering teams and integrates seamlessly with the semantic search engine to boost search accuracy.
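To make the idea of one abbreviation carrying several expansions more concrete, here is a minimal sketch of a dictionary that stores every sense of a term along with a context tag. It is an illustration only, not the Enginius dictionary or its custom syntax: the `Sense` and `JargonDictionary` names, the `discipline` tag, and the sample entries are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    expansion: str   # full form or definition of the term
    discipline: str  # hypothetical context tag, e.g. "test" or "contracts"


class JargonDictionary:
    """Toy dictionary that maps one term to many context-tagged senses."""

    def __init__(self):
        self._senses = defaultdict(list)

    def add(self, term, expansion, discipline):
        # Terms are stored upper-cased so "atp" and "ATP" share one entry.
        key = term.upper()
        sense = Sense(expansion, discipline)
        if sense not in self._senses[key]:
            self._senses[key].append(sense)

    def lookup(self, term, discipline=None):
        # Without a discipline, return every known sense of the term.
        senses = self._senses.get(term.upper(), [])
        if discipline is None:
            return list(senses)
        return [s for s in senses if s.discipline == discipline]


# Illustrative entries: the same abbreviation in two disciplines.
dictionary = JargonDictionary()
dictionary.add("ATP", "Acceptance Test Procedure", "test")
dictionary.add("ATP", "Authority to Proceed", "contracts")
```

Calling `dictionary.lookup("ATP")` returns both senses, while `dictionary.lookup("ATP", "contracts")` narrows the result to the sense tagged for that discipline.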
Traditional keyword search falls short when faced with complex jargon and abbreviations. Standard page ranking also fails, as most technical documents aren’t accessed frequently enough for ranking engines to learn usage patterns.
Enginius integrates its semantic engine and technical dictionary to deliver a fast, accurate search experience. It automatically generates synonyms for user queries and prioritizes results based on technical context.
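The two steps described above, synonym generation and context-based prioritization, can be sketched roughly as follows. This is a simplified illustration and not the Enginius implementation: plain substring matching stands in for the semantic engine, and the `expand_query` and `rank` functions, the `context_boost` parameter, and the document fields are all hypothetical.

```python
def expand_query(query, synonyms):
    """Return the query terms plus any known synonyms, lowercased, in order."""
    terms = []
    for word in query.lower().split():
        for candidate in [word, *synonyms.get(word, [])]:
            if candidate not in terms:
                terms.append(candidate)
    return terms


def rank(documents, terms, context, context_boost=1.0):
    """Score documents by matched terms, boosting those in the user's context."""
    scored = []
    for doc in documents:
        text = doc["text"].lower()
        # Crude stand-in for semantic matching: count terms found as substrings.
        score = sum(1 for term in terms if term in text)
        if score and doc.get("discipline") == context:
            score += context_boost
        scored.append((score, doc["id"]))
    # Highest score first; ties are broken by id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored if score > 0]
```

With a synonym table such as `{"atp": ["acceptance test procedure"]}`, a query for "ATP schedule" also matches documents that only spell the abbreviation out, and documents tagged with the user's discipline move up the list.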
The Enginius search engine enables rapid and precise search across complex technical repositories. With integration into the AI-powered document structure reconstruction model, Enginius search can take users directly to the relevant paragraph within any document. Our regulation search tools demonstrate just some of what Enginius search can do.
File formats like PDF don’t preserve document structure such as paragraphs, headings, or tables. Each line is stored as a separate block of text, which makes it hard to accurately interpret the semantics of technical content.
Enginius is developing highly accurate (>99%) AI models to reconstruct document structure automatically, without human input.
Enginius structure reconstruction powers precise semantic search across PDFs. It allows the search engine to jump to the right paragraph in large documents and enhances overall search effectiveness. Accurate structure also enables the AI dictionary to better build and maintain technical terminology.
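To show why line-by-line PDF text is hard to work with, here is a naive rule-based baseline that tries to regroup extracted lines into paragraphs. It is not the Enginius AI model, only a sketch of the problem that model addresses. The `merge_lines` function, its `full_width` parameter, and the 80% short-line threshold are assumptions chosen for illustration.

```python
def merge_lines(lines, full_width=60):
    """Group extracted text lines into paragraphs using three simple rules."""
    paragraphs, current = [], []

    def flush():
        # Close the paragraph being built, if there is one.
        if current:
            paragraphs.append(" ".join(current))
            current.clear()

    for raw in lines:
        line = raw.strip()
        if not line:
            # Rule 1: a blank line ends the current paragraph.
            flush()
            continue
        if current and current[-1].endswith("-"):
            # Rule 2: rejoin a word split across a line break.
            # This wrongly joins genuine hyphens such as "large-" + "scale".
            current[-1] = current[-1][:-1] + line
        else:
            current.append(line)
        # Rule 3: a short line that ends a sentence is probably the
        # last line of a paragraph.
        if len(line) < full_width * 0.8 and line.endswith((".", "?", "!")):
            flush()
    flush()
    return paragraphs
```

Rules like these break down quickly on headings, tables, multi-column layouts, and lists, which is the gap a learned structure reconstruction model is meant to close.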