Monday, January 21, 2019

Solr 7.6 - Solr Cloud installation

Solr 7.6 Getting Started

The Solr documentation is huge, and when time is short a quick getting-started guide is what matters most.  This post therefore documents the precise commands used to set up and run SolrCloud 7.6 with a three-node external ZooKeeper ensemble.

Steps to create Solr Cloud:

  1. Install Zookeeper Ensemble
  2. Add a node
  3. Start Solr using the zookeeper configured in Step 1.
  4. Create ConfigSet
  5. Create Collection
  6. Access SolrCloud GUI

Install Zookeeper

A three-node external ZooKeeper ensemble is used.  The three nodes are:
  • wolf:2181
  • tiger:2181
  • lion:2181
The variable ZK_HOST would therefore be wolf:2181,tiger:2181,lion:2181
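The connection string is easy to mistype by hand, so one option is to assemble it from the node list. The following is a minimal sketch, assuming the three hostnames above and the client port 2181; the names ZK_NODES and ZK_PORT are illustrative only and are not Solr or ZooKeeper settings.

```shell
#!/bin/sh
# Assemble the ZooKeeper connection string from the ensemble nodes.
# ZK_NODES and ZK_PORT are illustrative names, not Solr settings.
ZK_NODES="wolf tiger lion"
ZK_PORT=2181

ZK_HOST=""
for node in $ZK_NODES; do
  # Prefix a comma only when ZK_HOST already holds an entry.
  ZK_HOST="${ZK_HOST:+$ZK_HOST,}$node:$ZK_PORT"
done
export ZK_HOST

echo "$ZK_HOST"
```

Run as written, this prints wolf:2181,tiger:2181,lion:2181, which is the value used in the rest of this post.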

Add node

"Adding a node to Solr Cloud" provides detailed instructions for adding a node to the cloud.

Start Solr

SolrCloud is installed on the same three nodes as the ZooKeeper ensemble.

Start Solr using the external ZooKeeper ensemble:

<SOLR_HOME>/bin/solr start -c -z $ZK_HOST

Create ConfigSet

Without going into a detailed explanation of schema.xml, solrconfig.xml and the other related configuration files, the sample configuration files can be used to get started quickly.  The following script generates a ConfigSet named tech by uploading the configuration files stored in the configuration directory ~/solr/tech/.

#!/bin/sh

SOLR_HOME=~/solr-7.6.0
ZK_HOST=wolf:2181,tiger:2181,lion:2181
CONF_NAME=tech
CONF_DIR=~/solr/tech/

$SOLR_HOME/server/scripts/cloud-scripts/zkcli.sh -z $ZK_HOST -cmd upconfig -confname $CONF_NAME -confdir $CONF_DIR
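To confirm that the upload reached ZooKeeper, the ConfigSets API can be asked for the list of known ConfigSets. This is a rough sketch, assuming a Solr node is already running at the base URL passed in; the function name list_configsets is made up for this example.

```shell
#!/bin/sh
# list_configsets BASE_URL
# Hypothetical helper: asks the ConfigSets API (action=LIST) for the
# names of all ConfigSets known to the cluster.
list_configsets() {
  curl -sS "$1/admin/configs?action=LIST"
}
```

For the setup in this post the call would be list_configsets http://wolf:8983/solr, and the response should mention tech among the ConfigSets.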

Configure Zookeeper

Set ZK_HOST in the file <SOLR_HOME>/bin/solr.in.sh so that Solr reads the ZooKeeper ensemble address from there:

ZK_HOST="wolf:2181,tiger:2181,lion:2181"
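Editing the file by hand on three nodes is error prone, so the change can be scripted. The sketch below assumes a solr.in.sh-style file in which an active setting starts at the beginning of a line; set_zk_host is a hypothetical helper written for this post, not part of Solr.

```shell
#!/bin/sh
# set_zk_host FILE VALUE
# Hypothetical helper: replaces any active ZK_HOST assignment in FILE
# with ZK_HOST="VALUE" and leaves all other lines untouched.
set_zk_host() {
  file=$1
  value=$2
  tmp=$(mktemp) || return 1
  # Keep every line except an uncommented ZK_HOST assignment.
  # grep exits non-zero when it keeps nothing, which is harmless here.
  grep -v '^ZK_HOST=' "$file" > "$tmp" || true
  printf 'ZK_HOST="%s"\n' "$value" >> "$tmp"
  # Write back in place so the file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

On each node it could be called as set_zk_host ~/solr-7.6.0/bin/solr.in.sh "wolf:2181,tiger:2181,lion:2181". Commented-out lines such as #ZK_HOST="" are deliberately left alone.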

Create Collection

The following script creates a new collection named mytech from the ConfigSet tech created previously, with 3 shards and a replication factor of 2.


#!/bin/sh

SOLR_HOME=~/solr-7.6.0
COLLECTION_NAME=mytech
CONFIG_NAME=tech
SHARD_COUNT=3
REPL_FACTOR=2
$SOLR_HOME/bin/solr create -c $COLLECTION_NAME -n $CONFIG_NAME -s $SHARD_COUNT -rf $REPL_FACTOR
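The same collection could also be created through the Collections API instead of bin/solr. The sketch below only builds the request URL. It assumes three live nodes, and it assumes that maxShardsPerNode has to be raised explicitly for the API call: 3 shards times 2 replicas gives 6 cores, which over 3 nodes is 2 per node. The function name create_url is made up for this example.

```shell
#!/bin/sh
# create_url BASE_URL COLLECTION CONFIGSET SHARDS REPL_FACTOR NODES
# Hypothetical helper: prints a Collections API CREATE URL.
create_url() {
  total=$(( $4 * $5 ))                   # total number of replicas
  per_node=$(( (total + $6 - 1) / $6 ))  # replicas per node, rounded up
  printf '%s/admin/collections?action=CREATE&name=%s&collection.configName=%s&numShards=%s&replicationFactor=%s&maxShardsPerNode=%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$per_node"
}

create_url http://wolf:8983/solr mytech tech 3 2 3
```

The printed URL can then be passed to curl in double quotes, so that the shell does not interpret the ampersands.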


Access SolrCloud GUI

The SolrCloud thus created can be accessed at the URL http://wolf:8983/solr/.  The intuitive user interface can be used to explore the functionality provided by Solr.

Sample Data

The following sample data can be inserted from the SolrCloud GUI:
id,cat,name,price,inStock,author,series_t,sequence_i,genre_s
0553573403,book,A Game of Thrones,7.99,true,George R.R. Martin,"A Song of Ice and Fire",1,fantasy
0553579908,book,A Clash of Kings,7.99,true,George R.R. Martin,"A Song of Ice and Fire",2,fantasy
055357342X,book,A Storm of Swords,7.99,true,George R.R. Martin,"A Song of Ice and Fire",3,fantasy
0553293354,book,Foundation,7.99,true,Isaac Asimov,Foundation Novels,1,scifi
0812521390,book,The Black Company,6.99,false,Glen Cook,The Chronicles of The Black Company,1,fantasy
0812550706,book,Ender's Game,6.99,true,Orson Scott Card,Ender,1,scifi
0441385532,book,Jhereg,7.95,false,Steven Brust,Vlad Taltos,1,fantasy
0380014300,book,Nine Princes In Amber,6.99,true,Roger Zelazny,the Chronicles of Amber,1,fantasy
0805080481,book,The Book of Three,5.99,true,Lloyd Alexander,The Chronicles of Prydain,1,fantasy
080508049X,book,The Black Cauldron,5.99,true,Lloyd Alexander,The Chronicles of Prydain,2,fantasy
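As an alternative to the GUI, the rows above could be saved to a file and posted to the collection's update handler from the command line. This is a minimal sketch, assuming the data is stored in a file (hypothetically books.csv) and that the collection's schema accepts these field names; post_csv is a made-up helper name.

```shell
#!/bin/sh
# post_csv BASE_URL COLLECTION FILE
# Hypothetical helper: sends a CSV file to the collection's /update
# handler and commits so the documents become searchable.
post_csv() {
  curl -sS -X POST \
    -H 'Content-Type: application/csv' \
    --data-binary "@$3" \
    "$1/$2/update?commit=true"
}
```

For the collection created earlier the call would be post_csv http://wolf:8983/solr mytech books.csv.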


Sample Query

Once the data has been added to the cloud, a Solr search can be executed.

[Screenshot: sample query]
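A query can also be issued from the command line against the select handler. The sketch below assumes the mytech collection and the sample data from this post; solr_query is a hypothetical helper, and the default of 10 rows is a choice made for this example.

```shell
#!/bin/sh
# solr_query BASE_URL COLLECTION QUERY [ROWS]
# Hypothetical helper: runs a query against the /select handler.
# -G turns the url-encoded data into a query string on a GET request.
solr_query() {
  curl -sS -G \
    --data-urlencode "q=$3" \
    --data-urlencode "rows=${4:-10}" \
    "$1/$2/select"
}
```

For example, solr_query http://wolf:8983/solr mytech 'genre_s:scifi' 5 would ask for up to five documents whose genre_s field is scifi.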


Solr 7.6 Creating Collection using ConfigSets, Errors


Solr 7.6 Creating ConfigSet

The Solr ConfigSets API provides REST endpoints for creating, listing, viewing, and deleting ConfigSets.  Trying to create a collection from a ConfigSet uploaded through the REST service, as documented:
$ (cd solr/server/solr/configsets/sample_techproducts_configs/conf && zip -r - *) > myconfigset.zip

$ curl -X POST --header "Content-Type:application/octet-stream" --data-binary @myconfigset.zip "http://localhost:8983/solr/admin/configs?action=UPLOAD&name=myConfigSet"

results in the following error message:

ERROR: Failed to create collection 'tech' due to: {192.168.56.102:8983_solr=org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://192.168.56.102:8983/solr: Error CREATEing SolrCore 'tech_shard3_replica_n10': Unable to create core [tech_shard3_replica_n10] Caused by: The configset for this collection was uploaded without any authentication in place, and this operation is not available for collections with untrusted configsets. To use this component, re-upload the configset after enabling authentication and authorization., 192.168.56.104:8983_solr=org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://192.168.56.104:8983/solr: Error CREATEing SolrCore 'tech_shard1_replica_n2': Unable to create core [tech_shard1_replica_n2] Caused by: The configset for this collection was uploaded without any authentication in place, and this operation is not available for collections with untrusted configsets. To use this component, re-upload the configset after enabling authentication and authorization., 192.168.56.101:8983_solr=org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://192.168.56.101:8983/solr: Error CREATEing SolrCore 'tech_shard2_replica_n4': Unable to create core [tech_shard2_replica_n4] Caused by: The configset for this collection was uploaded without any authentication in place, and this operation is not available for collections with untrusted configsets. To use this component, re-upload the configset after enabling authentication and authorization.}

As the message states, a configset uploaded without authentication in place is treated as untrusted, and this operation is not available for collections that use it.  The documentation says that uploading a configset is enabled by default, but can be disabled with the runtime parameter -Dconfigset.upload.enabled=false. Disabling this feature is advisable if you want to expose a Solr installation to untrusted users (even though you should never do that!).

Instead of using the REST service to create the configset, I used Solr's zkcli.sh script, found under <SOLR_HOME>/server/scripts/cloud-scripts/, as shown below:


#!/bin/sh

SOLR_HOME=~/solr-7.6.0
ZK_HOST=wolf:2181,tiger:2181,lion:2181
CONF_NAME=tech
CONF_DIR=~/solr/tech/
$SOLR_HOME/server/scripts/cloud-scripts/zkcli.sh -z $ZK_HOST -cmd upconfig -confname $CONF_NAME -confdir $CONF_DIR


After executing the above script, the Solr collection tech could be created successfully from the ConfigSet tech defined in the script.

Sunday, January 13, 2019

Delete all documents from Solr

The wiki entry "How can I delete all documents from my index?" shows the following commands, which can be executed in the browser:
http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>
http://localhost:8983/solr/update?stream.body=<commit/>
When the same is tried in SolrCloud 7.6, Solr responds with the message that stream.body is disabled.
The RequestDispatcher documentation shows the command to enable remote streaming and stream.body.  The command is:
curl -H 'Content-type:application/json' -d '{"set-property": {"requestDispatcher.requestParsers.enableRemoteStreaming": true}, "set-property":{"requestDispatcher.requestParsers.enableStreamBody": true}}' http://localhost:8983/api/collections/
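An alternative that does not depend on stream.body is to send the delete command as the body of a POST request to the update handler. This is a minimal sketch, assuming a collection name is supplied by the caller; delete_all is a hypothetical helper, and it removes every document in the collection, so it should be used with care.

```shell
#!/bin/sh
# delete_all BASE_URL COLLECTION
# Hypothetical helper: posts a delete-by-query for *:* as the request
# body and commits, so stream.body does not need to be enabled.
delete_all() {
  curl -sS -X POST \
    -H 'Content-Type: text/xml' \
    --data-binary '<delete><query>*:*</query></delete>' \
    "$1/$2/update?commit=true"
}
```

For the collection from the earlier post the call would be delete_all http://localhost:8983/solr mytech.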