Improving Technical Management

Improving the speed and relevance of technical knowledge search

A global engineering company that helps automotive, aerospace, energy, and consumer products producers turned to Intuceo to improve its technical knowledge management. The client required an algorithm that could quickly yet accurately search unstructured data in k-pacs.

The solution, to be integrated into the client’s enterprise knowledge system, needed to resolve issues in the data and score documents by relevance to a search query. The client’s existing system took too long and failed to yield the expected results. A major challenge was integrating the solution with the client’s existing engineering workflows.

The client provided 94,344 documents, 10,000 of which were used to develop the model. Our DataSharp™ preprocessor merged, cleaned, and categorized the text, and a semantic dictionary, including frequency of occurrence, was built. The Intuceo team stemmed related terms, removed unnecessary punctuation, replaced original content with dictionary terms, constructed a tf-idf matrix, and filtered out terms that did not occur frequently across documents. K-means clustering was used, with cosine similarity as the distance measure (see the figure), to obtain the most similar documents; in this way, non-quantitative data was numerically tagged and the “distance” (i.e., degree of likeness) among documents was compared.
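The pipeline described above can be illustrated with a minimal sketch. This is not the DataSharp preprocessor or Intuceo's implementation. It assumes documents that are already tokenized and stemmed, uses a smoothed idf formula, and initializes k-means with a deterministic farthest-first rule. The function names (`tfidf_vectors`, `kmeans_cosine`), the `min_df` threshold, and the sample documents are all hypothetical, chosen for illustration.

```python
import math
from collections import Counter

def normalize(vec):
    """Scale a sparse vector (term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def tfidf_vectors(docs, min_df=1):
    """Build unit-length tf-idf vectors for a list of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Drop terms that occur in fewer than min_df documents.
    vocab = {t for t, c in df.items() if c >= min_df}
    vectors = []
    for doc in docs:
        tf = Counter(t for t in doc if t in vocab)
        # Smoothed idf (an assumption): log((1 + n) / (1 + df)) + 1.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        vectors.append(normalize(vec))
    return vectors

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def kmeans_cosine(vectors, k, iterations=20):
    """K-means with cosine similarity as the closeness measure."""
    # Farthest-first initialization keeps this sketch deterministic.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = [None] * len(vectors)
    for _ in range(iterations):
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            total = Counter()
            for v, lab in zip(vectors, labels):
                if lab == j:
                    total.update(v)  # sum the member vectors
            if total:
                centroids[j] = normalize(total)
    return labels

# Hypothetical, already-stemmed documents.
docs = [
    ["brake", "caliper", "wear", "test"],
    ["brake", "pad", "wear", "report"],
    ["turbine", "blade", "fatigue", "test"],
    ["turbine", "blade", "crack", "report"],
]
vectors = tfidf_vectors(docs, min_df=2)  # terms seen in only one document are dropped
labels = kmeans_cosine(vectors, k=2)
print(labels)
```

In this toy data the two brake documents share one cluster label and the two turbine documents share the other. A production system would more likely use sparse matrices and a tuned number of clusters.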

The solution lists the 10 most similar documents based on a search query. The total search time has been reduced from minutes to seconds, and relevance has been optimized. The client has knowledge on demand that increases the productivity and capabilities of its engineers, ensures a consistent engineering process, and prevents recurring errors.
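Returning the most similar documents for a query can be sketched as a top-k ranking by cosine similarity. This is a rough illustration under stated assumptions, not the deployed system. The document IDs, the helper name `top_k_similar`, and the use of raw term counts in place of tf-idf weights are hypothetical simplifications.

```python
import heapq
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k_similar(query_vec, doc_vecs, k=10):
    """Return up to k (doc_id, score) pairs, highest similarity first."""
    scored = (
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Hypothetical documents; raw term counts stand in for tf-idf weights.
doc_vecs = {
    "doc-a": Counter(["brake", "wear", "test"]),
    "doc-b": Counter(["turbine", "blade", "test"]),
    "doc-c": Counter(["brake", "pad", "wear", "report"]),
}
query_vec = Counter(["brake", "wear"])
results = top_k_similar(query_vec, doc_vecs, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

`heapq.nlargest` avoids sorting the full collection when only a handful of results is needed, which matters more as the number of documents grows.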

