JavaZone 2010 - Building a scalable search engine with Apache Solr, Hibernate Shards and MySQL
At Integrasco AS we've spent the last year developing a new data store and search engine for social media data mining. Our goal was a highly scalable design that runs on commodity hardware, is flexible enough to accommodate new metadata in the future, and makes that metadata instantly available for searching.
This presentation is an experience report from the project: how we achieved our goals and implemented a search engine for analysis of social and consumer-generated media.
Using Hibernate Shards and MySQL for data storage, and Apache Solr for full-text and metadata search, we are now running a system that is easy to scale horizontally as our data and customer base continue to grow.
Aleksander Stensby
Aleksander holds an MSc in Computer Science from the University of Agder and Carleton University in Canada, specializing in Pattern Recognition and Textual Analysis. He has been working as a lead developer at Integrasco AS since 2004, focusing mainly on developing the search technology that the company uses today.
Today, Aleksander holds the position of Analytics Director at Integrasco, in charge of all client deliverables and client communications. He works closely with the R&D team to build intelligent solutions that give both clients and in-house analysts the best possible tools.
Aleksander has broad experience with Java development and technologies such as Lucene Search, Solr, Hibernate, MySQL, Spring and Hibernate Shards, and a keen interest in search, machine learning, pattern recognition and scalability.
Jaran Nilsen
Jaran graduated with a Master of Computer Science in 2007. Since 2004 he has been working at Integrasco AS, developing web mining and search technology for analyzing social and consumer generated media.
Today he holds the position of Head of R&D at Integrasco and is in charge of the development of scalable data mining and search technology as well as intelligent processes based on machine learning and pattern recognition.
His interests range from the curious little code snippet to large-scale system architecture, and he has spent most of his career thus far pursuing these interests on the Java platform.
