Critical Capabilities for Your Enterprise Search Solution




Enterprise search tools and features you must have for your next deployment.

by Andy Wibbels on June 10, 2020


A couple of weeks back, we talked about the dumpster fire that is terrible enterprise search along with the path to something much better. Then, last week we did a quick taxonomy dive into the four main types of enterprise search engines. This week we’re looking at the shopping list to get you going.

Starting the search for your next enterprise search solution? Or maybe you’re under the gun to replace the creaky, dusty, legacy system you already have in place? Either way, here’s a rundown of the top features and capabilities to consider as you research and shop for a powerful enterprise search experience.

Search Capabilities

Keywords and key phrases are among the most familiar components of the search experience. A query parser receives a query and translates it into parameters for a search algorithm, which is executed across a specialized database. Older search technologies required a specific syntax in which users spelled out exactly what they wanted in a specialized language. Modern search parsers such as Solr’s DisMax parser work without this specificity. Advanced queries such as geospatial search still require specialized syntax, but most of the time that syntax is computer-generated by a search application built to speak to the search engine.
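
To make the “computer-generated syntax” idea concrete, here is a minimal sketch of a search application turning free-form user text into request parameters for Solr’s DisMax parser. The parameter names (defType, q, qf, rows) are standard Solr parameters; the field names, the boosts and the helper function build_dismax_params are assumptions made up for illustration.

```python
from urllib.parse import urlencode

def build_dismax_params(user_text, fields=None, rows=10):
    """Turn free-form user text into Solr DisMax request parameters.

    The default field names and boosts are illustrative assumptions.
    """
    if fields is None:
        fields = {"title": 2.0, "body": 1.0}
    # qf lists the fields to search, each with an optional ^boost
    qf = " ".join(f"{name}^{boost}" for name, boost in fields.items())
    return {
        "defType": "dismax",  # select the DisMax query parser
        "q": user_text,       # the user's text, passed through untouched
        "qf": qf,
        "rows": rows,
    }

params = build_dismax_params("vacation policy")
query_string = urlencode(params)  # ready to append to a /select URL
print(query_string)
```

The user never sees qf or boosts; the application adds them on every request.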

Faceting is a critical capability that allows the user to filter results efficiently based on a specific field. This is especially useful for limiting a search to a category or department. Faceting can also be used for range filtering, such as by date or price.

Synonyms can be defined in most search solutions. This capability enables implementers to adapt user queries to their corpus of data. For example, implementers may define “barrister” as a synonym for “lawyer.” Searches for “lawyer” will then also return documents that contain the term “barrister.”
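
Here is a toy sketch of both ideas over an in-memory list of documents. The documents, the department field and the one-way synonym map are made-up sample data, and a real engine would do this inside its index rather than by scanning a list.

```python
from collections import Counter

# Made-up sample corpus for illustration
DOCS = [
    {"id": 1, "text": "hire a barrister for the appeal", "department": "legal"},
    {"id": 2, "text": "lawyer onboarding checklist", "department": "hr"},
    {"id": 3, "text": "expense policy", "department": "finance"},
    {"id": 4, "text": "barrister fee schedule", "department": "finance"},
]

# One-way synonym map: a search for "lawyer" also matches "barrister"
SYNONYMS = {"lawyer": {"barrister"}}

def expand(term):
    """Return the term plus any configured synonyms."""
    return {term} | SYNONYMS.get(term, set())

def search(term, department=None):
    """Return (hits, facet counts); optionally filter hits by department."""
    terms = expand(term.lower())
    hits = [d for d in DOCS if terms & set(d["text"].lower().split())]
    # Facet counts are computed before filtering so users see every option
    facets = Counter(d["department"] for d in hits)
    if department is not None:
        hits = [d for d in hits if d["department"] == department]
    return hits, facets

all_hits, facets = search("lawyer")
finance_hits, _ = search("lawyer", department="finance")
```

The facet counts are what a UI would render as clickable filters next to the results.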

Signal capture is the capture of user behavior data or “signals.” In most search applications, signals include events such as user queries, clicks, adds, purchases and other similar clickstream data. However, in advanced uses, this may include user location, vector, altitude or any number of other event types. Signal capture is a collaboration between the front-end search application and the back-end search solution. The application captures the actual click and sends it to the search solution, which processes the event and stores it for later use. Newer features, such as recommendations, depend on signal capture.
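
As a rough sketch of that hand-off, the front-end serializes a click event and the back-end stores it for later aggregation. The event fields, the capture_click helper and the SignalStore class are all hypothetical, and a real solution would persist events rather than keep them in memory.

```python
import json
import time
from collections import Counter

def capture_click(user_id, query, doc_id):
    """What a front-end might send when a user clicks a result (assumed shape)."""
    return json.dumps(
        {"type": "click", "user_id": user_id, "query": query, "doc_id": doc_id}
    )

class SignalStore:
    """Hypothetical back-end store that keeps events in memory."""

    def __init__(self):
        self.events = []

    def ingest(self, payload):
        event = json.loads(payload)
        event.setdefault("timestamp", time.time())  # stamp on arrival
        self.events.append(event)

    def clicks_for(self, query):
        """Aggregate clicks per document for one query."""
        return Counter(
            e["doc_id"]
            for e in self.events
            if e["type"] == "click" and e["query"] == query
        )

store = SignalStore()
store.ingest(capture_click("u1", "vacation policy", "doc-7"))
store.ingest(capture_click("u2", "vacation policy", "doc-7"))
store.ingest(capture_click("u2", "vacation policy", "doc-9"))
top = store.clicks_for("vacation policy").most_common(1)
```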

Artificial intelligence and recommendations are newer techniques for finding the data most relevant to users. Not all data lends itself perfectly to keyword or key phrase search. For example, the most relevant result for a query may not be the one that contains the matching keyword. One way of solving this problem is by looking at what users clicked on most often for that query. However, that may not be enough. Context is important, and the user is important. Instead of showing results that most users clicked on, it may be more relevant to show what similar users clicked on. There are numerous algorithms and techniques for using user behavior data to help users find exactly what they need, even before they know they need it. This is the next frontier of enterprise search.
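
One simple way to picture the difference between “what most users clicked” and “what similar users clicked” is a click-count boost that can be restricted to a user segment. This is only a sketch under assumed data: the click log, the segments, the base scores and the 0.1 weight are invented, and production systems use far more sophisticated models.

```python
from collections import Counter

def click_counts(click_log, query, segment=None):
    """Count clicks per document for a query, optionally for one user segment."""
    return Counter(
        doc_id
        for seg, q, doc_id in click_log
        if q == query and (segment is None or seg == segment)
    )

def rerank(results, counts, weight=0.1):
    """Add a click-based boost to each base score and sort descending."""
    boosted = [(doc_id, score + weight * counts[doc_id]) for doc_id, score in results]
    return [doc_id for doc_id, _ in sorted(boosted, key=lambda p: p[1], reverse=True)]

# (segment, query, clicked document): made-up sample signals
LOG = [
    ("engineering", "onboarding", "dev-setup"),
    ("engineering", "onboarding", "dev-setup"),
    ("sales", "onboarding", "crm-guide"),
    ("sales", "onboarding", "crm-guide"),
    ("sales", "onboarding", "crm-guide"),
]
RESULTS = [("hr-handbook", 1.0), ("crm-guide", 0.9), ("dev-setup", 0.9)]

for_everyone = rerank(RESULTS, click_counts(LOG, "onboarding"))
for_engineers = rerank(RESULTS, click_counts(LOG, "onboarding", segment="engineering"))
```

With all clicks pooled, the sales guide wins; for an engineer, the developer setup guide moves to the top.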

Query pipelines allow implementers to change queries in stages and to change the data that is returned. The staged approach is a critical tool for handling complex data or complex queries and for providing behavior profiling functionality. Some solutions offer prepackaged functionality as well as extensibility using JavaScript or other programming languages.
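
The staged approach can be sketched as a list of small functions, each taking a query and returning a modified copy. The stage names, the query dictionary shape and the department-based filter are assumptions for illustration, not any particular product’s API.

```python
SYNONYMS = {"lawyer": ["barrister"]}  # made-up synonym list

def normalize_stage(query):
    """Lowercase the text and collapse extra whitespace."""
    return {**query, "text": " ".join(query["text"].lower().split())}

def synonym_stage(query):
    """Expand each term with its configured synonyms."""
    terms = []
    for term in query["text"].split():
        terms.append(term)
        terms.extend(SYNONYMS.get(term, []))
    return {**query, "terms": terms}

def department_filter_stage(query):
    """Use the user's profile to restrict results to their department."""
    dept = query.get("user", {}).get("department")
    filters = list(query.get("filters", []))
    if dept:
        filters.append(("department", dept))
    return {**query, "filters": filters}

def run_pipeline(query, stages):
    """Pass the query through each stage in order."""
    for stage in stages:
        query = stage(query)
    return query

result = run_pipeline(
    {"text": "  Lawyer  Fees ", "user": {"department": "legal"}},
    [normalize_stage, synonym_stage, department_filter_stage],
)
```

Because each stage is independent, stages can be reordered, swapped or tested on their own.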

UI Capabilities

A powerful search back-end is important, but users only see the front-end user interface (UI). Previously, some search solutions left this entirely to the implementer, but now some offerings are providing UI functionality that allows implementers to import or compose their UI instead of having to create it from scratch. This makes a lot of sense given most search UIs do similar things. Moreover, given that AI and personalization functionality often collaborate with UI, modern UIs are too complex to write and maintain by hand without a larger staff of experts.

WYSIWYG embedding is the most advanced functionality in the marketplace. This allows implementers to configure a search UI in a web-based administration tool and then “include” it on their site using HTML or JavaScript statements.

Smart panels and widgets combine back-end functionality, like recommendations or similarity search (aka “More Like This”), with UI components. These allow implementers to include this functionality in their UI without having to write the underlying UI or back-end handling code. In cloud-based solutions, these come with preconfigured back-end implementations.

Component libraries provide common UI functionality, such as typeahead. These exist in many forms, from tag libraries to JavaScript APIs.

Although technically a back-end component, REST connectivity is critical to any modern search UI. Most mobile and web UIs connect via JSON over a REST interface.

Typeahead and other forms of auto-suggest are now standard user expectations in search UI. As users type, the UI suggests what they’re likely to be interested in.
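
At its simplest, typeahead is a prefix lookup over known phrases. The sketch below uses a sorted list and binary search; the Typeahead class and the sample phrases are made up, and real implementations usually also rank suggestions by popularity.

```python
import bisect

class Typeahead:
    """Prefix suggestions over a sorted list of phrases (illustrative only)."""

    def __init__(self, phrases):
        self._phrases = sorted(p.lower() for p in phrases)

    def suggest(self, prefix, limit=5):
        prefix = prefix.lower()
        # Binary search to the first phrase that is >= the prefix
        start = bisect.bisect_left(self._phrases, prefix)
        suggestions = []
        for phrase in self._phrases[start:]:
            if not phrase.startswith(prefix):
                break  # sorted order means no later phrase can match
            suggestions.append(phrase)
            if len(suggestions) == limit:
                break
        return suggestions

typeahead = Typeahead(["speaker", "speaker stand", "spreadsheet", "search"])
suggestions = typeahead.suggest("spe")
```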

Auto-classification is a more advanced form of typeahead. For example, when a user types “speaker,” the category “audio electronics” is automatically selected.

Data Import

Data source connectors are an important piece of most search solutions. While REST APIs allow users to import from nearly any data source, having to write a connector to every common data source (e.g., Oracle, SQL Server or SharePoint) is a taxing endeavor for implementers. The most important question isn’t whether the solution supports the most connectors but whether it supports your data sources and any you are likely to deploy.

Parsers work in conjunction with data source connectors to process the data that comes back and turn it into documents. For example, if you’re scanning a local disk and pulling back files, should each file be loaded as a document or each row in the file? Is the document an XML or a CSV or a ZIP file? Parsers interpret the data into documents so that they can be further processed or indexed.
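
The file-versus-row question can be shown with a small sketch that parses the same CSV text two ways. The function names, the document id scheme and the sample data are assumptions for illustration.

```python
import csv
import io

def parse_csv_rows(name, raw_text):
    """Treat each CSV row as its own document, keyed by file name and row number."""
    reader = csv.DictReader(io.StringIO(raw_text))
    return [{"id": f"{name}#{i}", **row} for i, row in enumerate(reader, start=1)]

def parse_whole_file(name, raw_text):
    """Treat the entire file as a single document."""
    return [{"id": name, "body": raw_text}]

RAW = "title,price\nSpeaker,99\nCable,5\n"
row_docs = parse_csv_rows("products.csv", RAW)
file_docs = parse_whole_file("products.csv", RAW)
```

Row-level documents let users find “Speaker” as its own result; the file-level document only lets them find the file.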

Pipelines are used to connect data sources, parsers and stages of logic used to manipulate data into well-formed documents. In older systems these were known as ETL (extract, transform, and load) processes.

JavaScript is the modern scripting language that serves as a kind of “language of trade” for most developers. Because of that familiarity, some search solutions allow manipulating data with JavaScript.

A REST API allows operational control of the search solution as well as importing and exporting data.

Native libraries allow a search solution to bind into a language like Java or Python without the implementer having to write REST API glue code.

Operational Capabilities

Scalability and capacity are important differentiators among search solutions. Can the system scale to the number of documents and users you need to support? How hard is it to add capacity, and can that be done without significant downtime? Some solutions still use client-server architecture or rely on older computing technologies like shared file systems (NAS) instead of modern clustering topologies.

High availability is the capability of the system to suffer a hardware failure without data loss or downtime. This is a feature of a modern cluster topology. Modern search solutions should be resilient against network outages and hardware faults on multiple nodes.

Disaster recovery is the capability of the system to suffer the loss of a complete cluster or data center and fail over to a backup site. This requires cross-data-center replication (also called WAN replication). This is important for dealing with fiber cuts, weather, earthquakes or other unforeseen catastrophes without a major impact on business operations.

System monitoring is provided by modern search solutions in the form of REST APIs that expose statistics about system performance and uptime, along with graphical displays and dashboards.

A/B testing shows admins whether changes to search pipelines or other functionalities improve search performance for users. When applying personalization, recommendations or other AI techniques, it is important to determine whether the changes actually improved click-through rates, purchases or any specified measure of success. A/B testing works by directing some traffic to the new pipeline and comparing it to the original configuration.
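
A bare-bones sketch of that traffic split might hash each user ID into a stable bucket and then compare click-through rates per variant. The function names, the 50/50 default and the sample events are assumptions, and a real evaluation would also check that the difference is statistically significant.

```python
import hashlib
from collections import defaultdict

def assign_variant(user_id, treatment_share=0.5):
    """Deterministically map a user to 'A' (original) or 'B' (new pipeline)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    # Interpret the first 8 hex digits as a fraction in [0, 1)
    bucket = int(digest[:8], 16) / 16**8
    return "B" if bucket < treatment_share else "A"

def click_through_rates(events):
    """events: iterable of (variant, clicked) pairs, one per search."""
    searches = defaultdict(int)
    clicks = defaultdict(int)
    for variant, clicked in events:
        searches[variant] += 1
        clicks[variant] += int(clicked)
    return {variant: clicks[variant] / searches[variant] for variant in searches}

# Made-up sample outcomes: (variant, did the user click a result?)
rates = click_through_rates(
    [("A", True), ("A", False), ("B", True), ("B", True)]
)
```

Hashing the user ID, rather than picking randomly per request, keeps each user on the same variant across sessions.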

Security

Connectivity to major security technologies like Active Directory, LDAP, Kerberos and SAML or other single sign-on systems is critical to a search solution’s security capabilities.

Role-based security authorization determines which users are allowed to delete, read, modify or create documents as well as enact system changes.

Document-level security methods like security trimming allow for fine-grained control, ensuring that query results never include documents the user doesn’t have access to.
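
In its simplest form, security trimming is a filter that compares each document’s access list with the user’s groups. The acl field name and the deny-by-default behavior below are assumptions for this sketch; real engines typically apply the filter inside the query rather than after the fact.

```python
def trim_results(results, user_groups):
    """Keep only documents whose ACL shares at least one group with the user.

    Documents with no 'acl' field are hidden (deny by default).
    """
    allowed = set(user_groups)
    return [doc for doc in results if allowed & set(doc.get("acl", ()))]

# Made-up sample results
RESULTS = [
    {"id": "salary-bands", "acl": ["hr"]},
    {"id": "holiday-calendar", "acl": ["hr", "everyone"]},
    {"id": "draft-memo"},  # no ACL recorded
]

visible = trim_results(RESULTS, ["everyone", "engineering"])
```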

Analytics Capabilities

Usage analytics allow implementers to inspect how users interact with search and may even allow inspecting the actions of an individual customer. This capability allows implementers to understand how well they’re achieving their conversion goals and see changes in these metrics over time.

SQL connectivity allows analysts to chart and use data with common SQL tools. Solutions may make indexed data as well as user behavioral data available via SQL.

Advanced Functionality

Streaming changes how the solution returns results. Usually, search solutions return the most relevant items first, but computing this ranking is memory- and resource-intensive. Streaming allows results to be returned in the order they are retrieved and can return results based on conditions.
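
The trade-off can be sketched with a ranked top-k function, which has to score every document before returning anything, next to a generator that yields matches as they are retrieved. The function names and sample data are invented for illustration.

```python
import heapq
from itertools import count, islice

def top_k(docs, score, k):
    """Ranked retrieval: every document is scored before any is returned."""
    return heapq.nlargest(k, docs, key=score)

def stream(docs, condition):
    """Streaming: yield matching documents in retrieval order, one at a time."""
    for doc in docs:
        if condition(doc):
            yield doc

DOCS = [{"id": 1, "price": 5}, {"id": 2, "price": 50}, {"id": 3, "price": 20}]

ranked = top_k(DOCS, score=lambda d: d["price"], k=2)
streamed = list(stream(DOCS, lambda d: d["price"] >= 20))

# Streaming also works on a source too large to rank: here, an endless one
endless = ({"id": i, "price": i} for i in count())
first_three = list(islice(stream(endless, lambda d: d["price"] % 1000 == 0), 3))
```

The ranked call would never finish on the endless source, while the streamed one returns as soon as three matches are found.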

Named entity recognition (NER) is the use of natural language processing (NLP) to recognize the names of companies, individuals or other proper nouns. This can be useful for various types of filtered or faceted search.

Clustering and classification are machine learning techniques that allow data or queries to be grouped or labeled automatically.

Head/tail analysis is a machine learning technique that identifies and rewrites underperforming queries to be more like similar well-performing queries.
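
To give a feel for the idea, the sketch below splits queries into well-performing “head” and underperforming “tail” groups by click-through rate, then rewrites a tail query to its closest head query. Plain string similarity from Python’s difflib stands in here for the machine-learned model a real solution would use, and the thresholds and query statistics are made up.

```python
import difflib

def split_head_tail(stats, min_ctr=0.3):
    """stats maps query -> (searches, clicks); split queries by click-through rate."""
    head, tail = [], []
    for query, (searches, clicks) in stats.items():
        ctr = clicks / searches if searches else 0.0
        (head if ctr >= min_ctr else tail).append(query)
    return head, tail

def rewrite(query, stats, min_ctr=0.3, cutoff=0.8):
    """Rewrite an underperforming query to the most similar head query, if any."""
    head, _ = split_head_tail(stats, min_ctr)
    if query in head:
        return query  # already performs well
    # String similarity is a stand-in for a learned similarity model
    matches = difflib.get_close_matches(query, head, n=1, cutoff=cutoff)
    return matches[0] if matches else query

# Made-up query statistics: query -> (searches, clicks)
STATS = {
    "wireless speaker": (1000, 450),
    "laptop stand": (800, 320),
    "wirless speakers": (40, 2),
}

fixed = rewrite("wirless speakers", STATS)
```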

Delivering Outcomes at Scale

Beyond running down all of these essential platform features, you need to make sure your vendor has a proven track record of helping organizations like yours deploy these types of solutions. Look for a vendor that can truly partner with you to configure and design an enterprise search solution that fits your unique mix of data sources, users, and business problems, and that can show specifically how AI, machine learning, and deep learning fit in. Look for vendors that have broad experience across industries but can also articulate and understand the idiosyncrasies of your particular business.

Let’s Get Going

It’s time to replace hit-or-miss search with an all-in-one answer platform for data diggers, fact finders, and edge seekers everywhere. More than anything, it’s time to find out what’s possible when employees have all the insights they need, whenever they need them. Contact us today.
