We are building an e-commerce application on a Java stack with Hibernate and the Spring Framework. As with all e-commerce applications, we need to build search capability into ours.
We came across Hibernate Search and Apache Solr. Can someone list the pros and cons of each so that we can select the ideal solution for enterprise search?
Say you are using Hibernate for the persistence layer of your web application, with annotation-based configuration. Then you can take the same model classes (like the one given below) and index them in the Solr server by adding Solr-specific annotations.
Here is an example where this is done.
The following class is a Customer model class without Solr annotations.
@Entity
@Table(name="Customer")
public class Customer {
    private int customerId;
    private String customerName;
    private String customerAddress;

    @Id
    public int getCustomerId() {
        return customerId;
    }
    public void setCustomerId(int customerId) {
        this.customerId = customerId;
    }
    public String getCustomerName() {
        return customerName;
    }
    public void setCustomerName(String customerName) {
        this.customerName = customerName;
    }
    public String getCustomerAddress() {
        return customerAddress;
    }
    public void setCustomerAddress(String customerAddress) {
        this.customerAddress = customerAddress;
    }
}
Now let's annotate this class with Solr annotations so that Customer details are indexed in the Solr server.
@Entity
@Table(name="Customer")
public class Customer {
    // @Field here is SolrJ's org.apache.solr.client.solrj.beans.Field
    @Field
    private int customerId;
    @Field
    private String customerName;
    @Field
    private String customerAddress;

    @Id
    public int getCustomerId() {
        return customerId;
    }
    public void setCustomerId(int customerId) {
        this.customerId = customerId;
    }
    public String getCustomerName() {
        return customerName;
    }
    public void setCustomerName(String customerName) {
        this.customerName = customerName;
    }
    public String getCustomerAddress() {
        return customerAddress;
    }
    public void setCustomerAddress(String customerAddress) {
        this.customerAddress = customerAddress;
    }
}
Just put the @Field annotation on each field that you want indexed in the Solr server.
The remaining question is how to tell Solr to index this model. It can be done as follows.
Say you are going to persist a customer called alex in the database. First populate the alex object as follows:
Customer alex = new Customer();
alex.setCustomerName("Alex Rod");
alex.setCustomerAddress("101 washington st, DC");
After saving this alex object to the database, you need to tell Solr to index it. That is done as follows.
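session.save(alex);
session.getTransaction().commit();
String url = "http://localhost:8983/solr";
try {
    SolrServer server = new CommonsHttpSolrServer(url);
    server.addBean(alex);   // turn the @Field-annotated bean into a Solr document and send it
    server.commit();        // make the new document visible to searches
} catch (MalformedURLException e) {
    // the Solr URL is invalid
    e.printStackTrace();
} catch (SolrServerException e) {
    // Solr rejected the request or could not process it
    e.printStackTrace();
} catch (IOException e) {
    // low-level communication failure
    e.printStackTrace();
}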
That is all there is to Solr indexing alongside Hibernate. It is pretty straightforward. I have explained the basic idea of how to use it. I got this example from a commercial application where we used the above method to implement search functionality.
In addition to what has been said, when in a clustered environment:
Hibernate-search:
Cons:
- Requires a master/slave combination, which isn't always feasible, especially when your build/deployment process doesn't distinguish among the nodes (same war for all nodes).
- The indexes are hosted in the same server/process as the application running Hibernate, so you have one index per application node. This is sometimes overkill.
- It isn't real-time search, unless the load balancer uses session stickiness.
Pros:
- Zero to little configuration. Just drop the jar in the classpath.
- The bridge between Hibernate and Lucene is very straightforward. Just annotate the entities and voilà!
Solr/SolrCloud:
- It is decoupled from the application itself.
- Not real-time search, just as hibernate-search.
- Requires restart to change the schema.
- SolrCloud isn't exactly the easiest framework to configure.
- No straightforward Hibernate bridge. You have to code your own Hibernate listener and bind it to the post-[insert|delete|update] events (or find an open source one).
ElasticSearch
- Servers are independent of the application, just like solr.
- It is by far the easiest to configure in a cluster/cloud.
- It is real-time
- No straightforward Hibernate bridge either. (es-hibernate-connector on GitHub)
Personally I prefer ElasticSearch when running in the cloud.
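The "code your own Hibernate listener" point for Solr and ElasticSearch can be sketched roughly as below. This is only a minimal illustration in plain Java: `SearchIndex`, `InMemoryIndex`, and `IndexSyncListener` are hypothetical names, not part of Hibernate, SolrJ, or any ElasticSearch client. In a real application the three `onPost...` methods would be invoked from Hibernate's post-insert, post-update, and post-delete event listeners, and `SearchIndex` would wrap the actual search server client.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a remote index (Solr, ElasticSearch); not a real client API. */
interface SearchIndex {
    void upsert(String id, String text);
    void delete(String id);
}

/** In-memory index, used only so that the sketch runs without a search server. */
class InMemoryIndex implements SearchIndex {
    final Map<String, String> docs = new LinkedHashMap<>();

    public void upsert(String id, String text) { docs.put(id, text); }
    public void delete(String id) { docs.remove(id); }
}

/** Hypothetical listener that mirrors entity changes into the index. */
class IndexSyncListener {
    private final SearchIndex index;

    IndexSyncListener(SearchIndex index) { this.index = index; }

    // Inserts and updates both (re)write the document for the entity.
    void onPostInsert(String id, String text) { index.upsert(id, text); }
    void onPostUpdate(String id, String text) { index.upsert(id, text); }

    // Deletes remove the document so that stale results do not show up.
    void onPostDelete(String id) { index.delete(id); }
}

class IndexSyncDemo {
    public static void main(String[] args) {
        InMemoryIndex index = new InMemoryIndex();
        IndexSyncListener listener = new IndexSyncListener(index);

        listener.onPostInsert("1", "Alex Rod");
        listener.onPostUpdate("1", "Alex Rod, 101 washington st, DC");
        listener.onPostInsert("2", "Temporary customer");
        listener.onPostDelete("2");

        System.out.println(index.docs);
    }
}
```

Keeping the index behind a small interface like this means the listener does not care which search server sits on the other side, which is the main appeal of the decoupled setups listed above.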
Apache Solr is mainly used for full-text search: finding words (matching both singular and plural forms, for example) in a large set of documents, where each document ranges from one paragraph to a few pages. If you don't need text search and only query int and varchar columns, Solr may not be better than a regular database.
This link might be useful to you:
http://engineering.twitter.com/2011/04/twitter-search-is-now-3x-faster_1656.html
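To make the singular/plural point concrete, here is a toy inverted index in plain Java. It is a sketch under heavy assumptions, not how Solr or Lucene work internally: `TinyTextIndex` is a made-up class, and its `normalize` method only lowercases a token and strips one trailing "s", whereas a real analyzer chain is far more careful.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index: maps each normalized word to the ids of the documents containing it. */
class TinyTextIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Naive normalization (an assumption of this sketch): lowercase, drop one trailing "s". */
    static String normalize(String token) {
        String t = token.toLowerCase();
        return (t.length() > 3 && t.endsWith("s")) ? t.substring(0, t.length() - 1) : t;
    }

    /** Splits the text on non-word characters and records each normalized token. */
    void add(int docId, String text) {
        for (String token : text.split("\\W+")) {
            if (token.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(normalize(token), k -> new TreeSet<>()).add(docId);
        }
    }

    /** Returns the ids of documents containing the word in singular or plural form. */
    Set<Integer> search(String word) {
        return postings.getOrDefault(normalize(word), Collections.emptySet());
    }
}

class TinyTextIndexDemo {
    public static void main(String[] args) {
        TinyTextIndex index = new TinyTextIndex();
        index.add(1, "Cheap books for sale");
        index.add(2, "One book about Java");
        index.add(3, "Running shoes");

        System.out.println(index.search("book"));
        System.out.println(index.search("Books"));
    }
}
```

Both searches hit documents 1 and 2, because "book" and "books" normalize to the same term. A plain `WHERE name = 'book'` comparison on a varchar column would not do that, which is the distinction drawn above.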
Another alternative is to use both together and combine their strengths.
Have a look at: Combining the power of Hibernate Search and Solr
I'm using them together and it works fine.
Hibernate Search gives me the entity annotations and analysis, and collects changes within transaction boundaries, while Solr gives me the best search engine, with great features such as 1:m facets, clusters, etc.
It sounds like you need to read up on the pros and cons of each of these. There is extensive documentation available.
If you want my opinion, I would say that it makes sense to use Hibernate Search with Hibernate. The search indexes are updated when Hibernate performs database operations, and only when a database transaction is committed.
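The "only when a transaction is committed" behaviour can be pictured with a small queue that holds index work until commit. This is a plain-Java sketch of the idea, not Hibernate Search's actual implementation; `TxIndexQueue` and its methods are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: buffers index operations until the surrounding transaction ends. */
class TxIndexQueue {
    private final List<String> pending = new ArrayList<>();  // work queued in the open transaction
    private final List<String> index = new ArrayList<>();    // stands in for the real search index

    /** Called whenever an entity is saved or updated inside the transaction. */
    void enqueue(String document) { pending.add(document); }

    /** On commit, the queued documents become visible in the index. */
    void commit() {
        index.addAll(pending);
        pending.clear();
    }

    /** On rollback, the queued work is dropped and the index is left untouched. */
    void rollback() { pending.clear(); }

    /** Returns a copy of the documents currently in the index. */
    List<String> indexed() { return new ArrayList<>(index); }
}

class TxIndexQueueDemo {
    public static void main(String[] args) {
        TxIndexQueue queue = new TxIndexQueue();

        queue.enqueue("customer: Alex Rod");
        queue.rollback();                 // transaction failed, nothing is indexed

        queue.enqueue("customer: Alex Rod");
        queue.commit();                   // transaction succeeded, the document is indexed

        System.out.println(queue.indexed());
    }
}
```

The practical benefit is that the index never contains rows that were rolled back in the database.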
Hibernate Search is a "bridge" between Hibernate and Lucene. In other words, it makes persisted Hibernate entities automagically searchable in a Lucene index.
Solr is a framework built on top of Lucene (both projects are supposed to be merged one day, but it's a long way to go). Differences between Solr and Lucene are explained in another SO post.