|
Track 1, Day 2:
Practical Relevance: Tips and Tricks for Understanding and Improving Search Quality |
|
Grant Ingersoll
Slides
|
|
It is often assumed that a "search engine" does a good job of searching and finding relevant items for a query. In truth, all search engines can and should be tuned, and Lucene and Solr are no exception. In this talk, Lucene/Solr committer Grant Ingersoll will provide practical tips for improving the relevance of search applications, focusing on time-tested techniques in information retrieval as well as in-depth details on how to leverage Lucene and Solr's capabilities to get higher-quality results.
Speaker bio: Grant Ingersoll, Chair of the Apache Lucene PMC, is a published expert in search and Natural Language Processing, with many articles on Lucene, Solr, findability, and relevance, and is a co-founder of the Apache Mahout machine learning project. Grant is the author of the forthcoming book "Taming Text" from Manning Publications and a co-founder of Lucid Imagination.
|
Lucene Connectors Framework: An Introduction |
|
Karl Wright
Slides
|
|
This presentation is an introduction to the capabilities, design, and architecture of LCF. It will cover LCF's basic concepts, the crawling model, and the security model. The emphasis will be on the framework itself rather than on the details of specific connectors. There will be a demonstration of LCF in the context of Lucene/Solr, in which we will build and use a system including LCF, Solr, and Lucene to index and search for geographic documents from the web.
Speaker bio: Karl Wright graduated from the Massachusetts Institute of Technology in 1982, and from Stanford University in 1984. Since then, Karl has been active in the areas of compiler development, speech recognition (where he holds several patents), and content management systems, having been chief architect of one such system for many years. Since 2005, Karl has been involved in the development of geographic text search for MetaCarta, Inc., and since its inception has been the primary developer of the software that is now the core part of the Lucene Connectors Framework. Recently, Karl became an employee of Nokia, Inc.
|
Solr in the Cloud |
|
Mark Miller
Slides
|
|
Solr is reaching for the clouds with its next release, introducing new features to ease the deployment of Solr into large-scale, distributed environments. Lucene/Solr committer Mark Miller will talk about the past inconveniences of scaling Solr across many servers, explain how the current Solr Cloud initiative addresses some of those issues, and look ahead to Solr Cloud features on the horizon. Learn how Solr is beginning to take advantage of Apache ZooKeeper to launch into the cloud.
Speaker bio: Mark Miller is a member of the Lucene PMC, a Lucene Committer and a Solr Committer. He's most recently been leading work on Solr Cloud and per-segment searching.
|
Neural Networks, Newspapers and Solr - A short tour through extending Solr for real-world text-classification |
|
Sven Maurmann
Slides
|
|
This talk describes how to use Solr as a core component for implementing context-sensitive advertising in a regional news portal. The talk first discusses some basic topics in the classification of large text corpora and relates the academic approach to real-world problems such as varying text length and structure. It presents some of the more popular classification algorithms, such as Naive Bayes, neural networks, and support vector machines, and indicates how to use these algorithms in real code using open source implementations.
In the second part of the talk, Solr and its plugin extensibility mechanisms for search handlers and search components are explored: using a simple yet useful example, a first search component is developed that constructs a query filter based on authorization credentials. In the final section, this knowledge is then used to develop a second search handler that relates a newspaper article (or, more generally, a web page) together with some classification information to a set of advertisements and returns a subset based on a relevance ranking.
The talk is intended as a description of an implementation and as a tutorial on how to extend Solr without much effort. During the talk, several useful libraries for text manipulation are visited.
Speaker bio: Born in 1961, Sven Maurmann studied pure mathematics at the University of Bonn and the Max-Planck-Institute for Mathematics (MPI). He graduated in arithmetic algebraic geometry (Shimura varieties as moduli spaces for abelian varieties) in 1989 and was an assistant at the MPI from 1989 until 1998, pursuing further research in mathematics and also heading the local IT department at the MPI. In 1998, he founded kippdata informationstechnologie GmbH together with Reiner Jung. Since then he has consulted for a variety of clients from the public and private sectors and is responsible for software development and research at kippdata. He currently uses Lucene, Solr and various other components to build special search applications that need content analysis and classification.
|
Combatting Information Overload - Search in Military Information Gathering Systems |
|
Alexandra Larsson
Slides
|
|
Good decisions stem from good information; this is true for both military and civilian enterprises. Vast amounts of time and resources are being invested in order to collect information. But to what end? Granted, somewhere among that information there is probably something you will find useful. But large amounts of information quickly become incomprehensible. In order to combat information overload you need a select-and-filter tool such as search, and that's where Findwise comes in. However, for FMKE it is not enough to simply locate the information they have available. Captain Alexandra Larsson, the intelligence officer in charge of leading the initiative, makes this fact very clear. It is just as important to get an idea of what information is not there. In essence, FMKE is in the process of creating information from information. This is also one of the great differences between the kind of web-based search and retrieval systems we have come to depend on and a state-of-the-art knowledge management system. The latter is not just a retrieval tool; it is an information workbench where the user can select, retrieve, examine and manipulate information.
Speaker bio: Alexandra Larsson is the Concept Lead for the Knowledge Support project at JCDEC, which addresses operational-level intelligence and information and knowledge management procedures for JFHQ. She is also the overall IT Architect for the experimentation platform that is used to take concept ideas into solutions that can be tested with military and civilian staff in experiments at the Centre. Previously she served as R&D Officer at the Armed Forces Intelligence and Security Centre and, before that, as Squadron Intelligence Officer at a multi-role fighter squadron flying the JAS 39 Gripen.
|