IBM is the largest IT company in the world, and includes a Research Division with about 3000 employees in eight labs worldwide. The Haifa Research Lab (HRL) is the largest of the five labs outside the United States, with 547 employees. Since it first opened as the IBM Scientific Center in 1972, HRL has conducted decades of research that has been vital to IBM's success. HRL staff members are actively involved in the academic community, publishing papers in leading conferences and journals, participating in program committees, and organizing conferences and workshops.

The Information & Interaction Technologies department of HRL focuses on the analysis, exchange, and management of the explicit information expressed in unstructured or semi-structured content, as well as implicit information.

The Information Retrieval Group within that department focuses on the problem of information overload wherever large amounts of unstructured or semi-structured textual data are available, whether on the desktop, in the enterprise, or on the World Wide Web. We provide search solutions for unstructured or semi-structured data, as well as disambiguation technologies for information mining. The main focus areas of the group are core search engine technologies, faceted search and business intelligence, content analysis, social search and discovery, big data indexing and search, and searching over entity-relationship graphs.

The technology behind these solutions stems from the well-established computer science discipline of information retrieval (IR), which focuses on the representation, storage, organization, and retrieval of unstructured data in general and textual documents in particular. The main goal of this discipline is to support information discovery tasks such as searching and browsing. IR Group members actively participate in the IR academic community. Members of the group publish papers in leading IR conferences (SIGIR, CIKM, WWW, WSDM, and others) and journals, participate in program committees, and organize workshops and panels.

Contribution to CULTURA

HRL will bring to CULTURA its expertise in text analysis and social search. In particular, we will focus on the generation of rich social semantic networks from entities and relationships extracted through content analytics, and on the exploitation of these networks for information retrieval and discovery tasks.
We bring to the project SaND, a Social Networks & Discovery tool for information discovery and analysis of heterogeneous data sources in the enterprise. SaND leverages complex relationships between content and people as surfaced through social applications to unleash the value of information. Its integrated index supports combining content-based analysis and people-based analysis over a rich data foundation.
We intend to extend the SaND data model to represent the semantic entities extracted from the analyzed content, together with their inter-relations, and to enhance SaND services such as social search, data discovery, personalization, and recommendation over the CULTURA datasets in support of the project goals.

Related Work

Team members from HRL participate, or have participated, in several EU projects that are related to CULTURA:

  • SAPIR (STREP), which developed a P2P large-scale architecture for searching and retrieving complex multi-media and audio-visual documents;
  • ROBUST, which deals with huge-scale business community cooperation, risk management, simulation and analysis;
  • SocIoS, which paves the way for building qualitative, functional and usable business applications exploiting the User Created Content and the Social Graph of users in Social Networks.