Panel: The Future of Search in Plone, 2023 Edition

published Oct 05, 2023

Panel discussion at Plone Conference 2023, Eibar, Basque Country.

Panelists are: Sally Kleinfeldt, Tiberiu Ichim, Timo Stollenwerk, Eric Steele, Eric Bréhault, Rikupekka Oksanen and Érico Andrei.

This panel provides a brief history and modern examples of Plone search, followed by a discussion of what improvements are needed, from both a marketing and a technical perspective. We discussed this topic at the 2011 conference and it will be interesting to see how our opinions have changed. The panel consists of people who have recently been active in Plone search advances.

Back in the 2000s it was: "Wow, a CMS with built-in search!" In the 2010s: "Wow, open source search engines are becoming really good." In the 2020s: "Wow, we really need better search solutions on larger sites."
In 2011 we said that navigation needs immediate updates: a new item should be visible in the navigation right away. For search it is fine if the item shows up a bit later. Solr and Elasticsearch have more features than the ZCatalog, and there are armies of engineers behind them. We had collective.solr versus alm.solrindex. It felt like a good idea to ship with Solr/Elasticsearch integration, but not require it.
Do we need an easy Plone + Solr/Elasticsearch install? Do we need to choose between these two?

Timo: We use Solr on a regular basis, for most clients. For collective.solr we had a buildout solution which was supposed to make it easier, but it added an extra layer of indirection: it is better to rely on the Solr documentation. There should be a good default, and we can have a search control panel, but you will need to learn about Solr to really configure it.

Rikupekka: We run small and large sites at the university. For small sites the standard Plone search is fine. For larger sites we use Solr. One problem with 50,000 documents is when hundreds of them have the title "Research". It would be nice to have a warning message then: "We already have this many documents with the same title, please be more specific."
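The warning Rikupekka describes could be sketched roughly as follows. This is only an illustration in plain Python: the function name, the `threshold` parameter and the list of existing titles are assumptions, and a real site would get the count from its search backend instead of a list.

```python
def duplicate_title_warning(new_title, existing_titles, threshold=100):
    """Return a warning string if too many documents share the title, else None."""
    # Compare titles case-insensitively and ignore surrounding whitespace.
    normalized = new_title.strip().lower()
    count = sum(1 for title in existing_titles if title.strip().lower() == normalized)
    if count >= threshold:
        return (
            f'We already have {count} documents with the title '
            f'"{new_title.strip()}", please be more specific.'
        )
    return None


# Hypothetical data: three documents called "Research" and one called "News".
titles = ["Research", "research ", "Research", "News"]
warning = duplicate_title_warning("Research", titles, threshold=3)
```

The threshold would need tuning per site; a university with 50,000 documents probably wants a different limit than a small intranet.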

Eric Steele: Would be good if we market this correctly.

Tiberiu: At EEA we use Elasticsearch. Lately, alternative vector-based solutions have started popping up. Currently we simply fetch the HTML of the page, just like Google Search does.

Guido: At Quaive we use Solr, which is so much better than the catalog. Tuning it to give more weight to some fields should be an easy way to improve the results.
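As an illustration of weighting fields, Solr's eDisMax query parser accepts a `qf` parameter with `field^boost` entries. The sketch below only builds the query string; the field names and boost values are assumptions for the example, not a recommended configuration, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def build_boosted_query(text, boosts, rows=10):
    """Build a URL query string for a Solr eDisMax search with per-field weights."""
    # qf is a space-separated list of "field^boost" entries.
    qf = " ".join(f"{field}^{weight}" for field, weight in boosts.items())
    params = {"defType": "edismax", "q": text, "qf": qf, "rows": rows}
    return urlencode(params)


# Hypothetical weights: a match in the title counts five times as much
# as a match in the full text.
query_string = build_boosted_query(
    "research funding",
    {"Title": 5, "Description": 2, "SearchableText": 1},
)
```

The resulting string would be appended to the select URL of a Solr core. Which fields exist depends on the Solr schema of the site.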

Érico: We could get rid of ZCatalog in all Plone instances. If navigation works in one way, and search in a different way, it is going to be a nightmare to debug. If we have money, we should hire Nuclia.

Eric Bréhault: It needs to be opinionated, configured correctly.

Sally: Solr is open source, Elasticsearch is not.

Timo: I don't care about Solr versus Elasticsearch, we can make any decision there. Integration is important: might be that collective.elasticsearch is doing some things smarter than

Guido: If you use an external service, you should remove the SearchableText index from the ZCatalog. And you need to make sure the indexers work: can you extract text from PDF, Word, etc.

Érico: We want real faceted navigation/search, like eea.facetednavigation did. The danger is that some add-ons do not work if the search is different. With a well-defined search API, this should be no problem.

Érico: Sounds like there are benefits to each solution. People will want to choose. An abstraction layer will make it a lot safer.

Timo: Solr and Elasticsearch differ a lot, especially on how they handle facets. It is difficult to have an abstraction layer for this. And the responses will be different. If you try to transform the results so you get the same answer from Solr and Elasticsearch, then it kills performance.

Érico: It should be the same type of info as you get from the ZCatalog.
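To make the abstraction layer idea concrete, here is a minimal sketch of a backend-neutral search interface. All names (`SearchResult`, `SearchBackend`, `InMemoryBackend`, `site_search`) are hypothetical and not part of Plone. It only covers plain text search, so it sidesteps the facet differences Timo mentions.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class SearchResult:
    """Backend-neutral hit, loosely comparable to the info a catalog result carries."""
    path: str
    title: str
    score: float = 0.0


class SearchBackend(Protocol):
    """Anything with a matching search() method can serve as a backend."""
    def search(self, text: str, limit: int = 10) -> list[SearchResult]: ...


class InMemoryBackend:
    """Toy backend for the example; a real one would call Solr or Elasticsearch."""
    def __init__(self, documents):
        self.documents = documents  # mapping of path -> title

    def search(self, text, limit=10):
        needle = text.lower()
        hits = [
            SearchResult(path=path, title=title, score=1.0)
            for path, title in sorted(self.documents.items())
            if needle in title.lower()
        ]
        return hits[:limit]


def site_search(backend: SearchBackend, text: str) -> list[str]:
    """Calling code depends only on the interface, not on the backend."""
    return [hit.path for hit in backend.search(text)]


backend = InMemoryBackend({"/a": "Research", "/b": "News", "/c": "Research funding"})
paths = site_search(backend, "research")
```

Add-ons written against such an interface would keep working when the backend changes. The open question from the discussion remains: converting every backend response into a common shape has a cost.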

Guido: We need a solution that can handle and fix inconsistencies. The ZCatalog takes part in the transaction machinery in Plone, the external solutions typically do not.
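One common way to narrow this gap is to queue index operations during a request and send them to the external service only after the transaction has committed. The sketch below shows the pattern in plain Python. The class and method names are hypothetical, and it does not use the real transaction machinery; an actual integration would have to connect `after_commit` to it.

```python
class PendingIndexQueue:
    """Collect index operations and send them only after a successful commit."""

    def __init__(self, send):
        self._send = send  # callable that delivers one operation to the external index
        self._pending = []

    def index(self, path, data):
        self._pending.append(("index", path, data))

    def unindex(self, path):
        self._pending.append(("unindex", path, None))

    def after_commit(self, success):
        """Flush the queue on success, discard it on abort. Returns the number sent."""
        operations, self._pending = self._pending, []
        if not success:
            return 0
        for operation in operations:
            self._send(operation)
        return len(operations)


# A list stands in for the external search service in this example.
sent = []
queue = PendingIndexQueue(sent.append)

queue.index("/news/draft", {"Title": "Draft"})
aborted_count = queue.after_commit(False)  # transaction aborted: nothing is sent

queue.index("/news/item", {"Title": "Item"})
queue.unindex("/news/old")
committed_count = queue.after_commit(True)  # transaction committed: two operations sent
```

This avoids indexing changes that were rolled back. It does not repair anything if the external service is down at flush time, so a way to detect and fix inconsistencies is still needed.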

Philip: At one point in Sorrento (Plone Open Garden) we picked Solr and said it should be a first-class citizen in Plone. Just a single sentence in a Google Doc somewhere.