MoreLikeThis Queries with Spring Data Solr

MoreLikeThis queries in Spring Data Solr (keywords: More Like This, MLT) may seem daunting at first, particularly since there is no specific MoreLikeThis functionality in the Spring Framework. There is also very little documentation on the subject. Yet we're going to see that implementing MoreLikeThis Solr queries is as easy as adding a single property to a Spring @Query annotation! Well, almost as easy.

Let's first look at our Spring Boot console output.

We're running two queries here: the first retrieves a post by PostId, and the second is our MoreLikeThis query using the same filter, in this case PostId=435.

Service Layer

Next is the PostDocService implementation class method which, as you can see, is no different from most other service-level methods.

Repository Query

Our Repository Query is straightforward as well. But wait a minute, where is all of the other Java code? Is it possible that all we have to do in Spring to perform MoreLikeThis searching is add a requestHandler property to @Query? The short answer is definitely “yes”, with a little configuration help from our Solr Server friends.
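Here is a minimal sketch of what such a repository method might look like. So that it compiles without Spring on the classpath, the `Query` annotation below is a local stand-in that mirrors only the `value` and `requestHandler` attributes; in a real project it would be Spring Data Solr's `@Query`. The repository name, the method name, the `PostDoc` fields and the `id:?0` query string are assumptions for illustration, not taken from the NixMash source.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.List;

// Local stand-in for Spring Data Solr's @Query so this sketch compiles on its own.
// Only the two attributes used below are mirrored.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Query {
    String value() default "";
    String requestHandler() default "";
}

// Hypothetical document type; the real PostDoc will have more fields.
class PostDoc {
    final String id;
    final String title;

    PostDoc(String id, String title) {
        this.id = id;
        this.title = title;
    }
}

// Hypothetical repository; in Spring this would extend a Spring Data Solr repository interface.
interface PostDocRepositorySketch {

    // "value" selects the source document (field name "id" is an assumption);
    // "requestHandler" sends the request to /mlt rather than the default search handler.
    @Query(value = "id:?0", requestHandler = "/mlt")
    List<PostDoc> findMoreLikeThis(long postId);
}
```

The point of the design is that the method stays an ordinary derived repository method; everything specific to MoreLikeThis lives in the Solr handler named by `requestHandler`.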

Before continuing, you probably already know that we could use different Spring Query Types to get the same results in 100% Java, but I like the simplicity of this single @Query property and Solr Server Config combo.

Adding a Solr Server RequestHandler

Our Golden Ticket for MoreLikeThis searching in Solr is the /mlt RequestHandler, which we will define in the solrconfig.xml file of our Solr Core. This is also where we'll set the various MoreLikeThis parameters.
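A handler definition along these lines would do the job. Treat it as a sketch: `solr.MoreLikeThisHandler` and the `mlt.*` parameter names are standard Solr, but the field names `title` and `body` and the threshold values are assumptions and should be replaced with whatever fits your own schema.

```xml
<!-- solrconfig.xml: register a MoreLikeThis handler at /mlt -->
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
  <lst name="defaults">
    <!-- Fields compared for similarity (hypothetical field names) -->
    <str name="mlt.fl">title,body</str>
    <!-- Minimum term frequency in the source document (example value) -->
    <int name="mlt.mintf">1</int>
    <!-- Minimum number of documents a term must appear in (example value) -->
    <int name="mlt.mindf">1</int>
    <!-- Return the matched source document alongside the similar ones -->
    <bool name="mlt.match.include">true</bool>
  </lst>
</requestHandler>
```

Because these are handler defaults, any of them can still be overridden per request with a parameter of the same name.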

For good measure we will also add the termVectors property to our MoreLikeThis fields in our Solr schema.xml file. Term vectors are not required, but when they are built into the index as the records are created, Solr doesn't have to re-analyze the stored field content at query time to work out each document's terms.
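As a rough illustration, the field definitions might look like the following. The field names and the `text_general` type are assumptions; only the `termVectors="true"` attribute is the part that matters here.

```xml
<!-- schema.xml: enable term vectors on the fields listed in mlt.fl (hypothetical fields) -->
<field name="title" type="text_general" indexed="true" stored="true" termVectors="true"/>
<field name="body" type="text_general" indexed="true" stored="true" termVectors="true"/>
```

Term vectors are written at index time, so documents indexed before the change need to be reindexed to pick them up.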

The Solr Server Search Equivalent

It is often helpful to understand how searches are performed on Solr Server before writing any Java code to produce the same results. Below we're doing a search on our Solr Server using our /mlt handler (1). We retrieve our matching document (2), then display all MoreLikeThis posts in our Response Object (3). The 523 posts are all those remaining in the index, ranked from highest to lowest similarity to the “match” document.
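To make that request concrete, here is a small plain-Java sketch that assembles the kind of URL you would send to the /mlt handler. The host, the core name `posts`, and the field names `id`, `title` and `body` are assumptions; the parameter names (`q`, `rows`, `mlt.fl`, `mlt.match.include`) are standard Solr.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class MltUrlSketch {

    // Builds a GET URL for the /mlt handler of the given core,
    // URL-encoding each parameter name and value.
    static String buildMltUrl(String solrBaseUrl, String core, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return solrBaseUrl + "/" + core + "/mlt?" + query;
    }

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "id:435");               // selects the source ("match") document
        params.put("mlt.fl", "title,body");      // fields to compare (hypothetical)
        params.put("mlt.match.include", "true"); // include the match document in the response
        params.put("rows", "10");                // number of similar documents to return
        System.out.println(buildMltUrl("http://localhost:8983/solr", "posts", params));
    }
}
```

Pasting the printed URL into a browser or curl is a quick way to check the handler configuration before involving Spring at all.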

Again, the results. I'd say it works!

Source Code Notes for this Post

Source code discussed in this post can be found in my NixMash Blog project located on GitHub.