I’m using this query to perform a full text search on a MySQL database:
SELECT DISTINCT
    questions.id,
    questions.uniquecode,
    questions.spam,
    questions.questiondate,
    questions.userid,
    questions.description,
    users.login AS username,
    questions.questiontext,
    questions.totalvotes,
    MATCH(questions.questiontext, questions.uniquecode)
        AGAINST ('rock guitarist chick*' IN BOOLEAN MODE) AS relevance
FROM questions
LEFT JOIN users ON questions.userid = users.id
LEFT JOIN answer_mapping ON questions.id = answer_mapping.questionid
LEFT JOIN answers ON answer_mapping.answerid = answers.id
LEFT JOIN tagmapping ON questions.id = tagmapping.questionid
LEFT JOIN tags ON tagmapping.tagid = tags.id
WHERE questions.spam < 10
  AND (
        MATCH(questions.questiontext, questions.uniquecode)
            AGAINST ('rock guitarist chick*' IN BOOLEAN MODE)
     OR MATCH(answers.answertext)
            AGAINST ('rock guitarist chick*' IN BOOLEAN MODE)
     OR MATCH(tags.tag)
            AGAINST ('rock guitarist chick*' IN BOOLEAN MODE)
  )
GROUP BY questions.id
ORDER BY relevance DESC
The results are very relevant, but the search is really slow and is getting slower and slower as the tables grow.
Table stats:
questions – 400 records
indexes
- PRIMARY – BTREE – id
- BTREE – uniquecode
- BTREE – questiondate
- BTREE – userid
- FULLTEXT – questiontext
- FULLTEXT – uniquecode
answers – 3,635 records
indexes
- PRIMARY – BTREE – id
- BTREE – answerdate
- BTREE – questionid
- FULLTEXT – answertext
answer_mapping – 4,228 records
indexes
- PRIMARY – BTREE – id
- BTREE – answerid
- BTREE – questionid
- BTREE – userid
tags – 1,847 records
indexes
- PRIMARY – BTREE – id
- BTREE – tag
- FULLTEXT – tag
tagmapping – 3,389 records
indexes
- PRIMARY – BTREE – id
- BTREE – tagid
- BTREE – questionid
For whatever reason, when I remove the tagmapping and tags JOINs, the search speeds up considerably.
Do you have any tips on how to speed this query up?
Thanks in advance!
Answer
Well, you could combine your joins into a cached view or an extra table. Make sure the query cache is active and define your join as a SELECT so it can be cached, and ensure there is enough memory. Normally that shouldn’t be the bottleneck, but in your case it probably is: only 400 records is nothing, and it’s already slow, while the rest looks good. What sort of hardware and configuration are you running?
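As a rough sketch of the "extra table" idea, the text from questions, answers and tags could be flattened into one row per question with a single FULLTEXT index, so the search no longer has to OR three MATCH clauses across four joins. The table name question_search, the column searchtext and the column types are assumptions made for illustration; they are not part of the original schema.

```sql
-- Sketch only: question_search, searchtext and the column types are
-- hypothetical and need adjusting to the real schema.

-- The default group_concat_max_len (1024) would truncate long answer text.
SET SESSION group_concat_max_len = 1000000;

CREATE TABLE question_search (
    questionid INT UNSIGNED NOT NULL PRIMARY KEY,
    searchtext MEDIUMTEXT NOT NULL,
    FULLTEXT KEY ft_searchtext (searchtext)
) ENGINE=MyISAM;  -- InnoDB also supports FULLTEXT from MySQL 5.6 on

-- One row per question: its own text plus all of its answers and tags.
-- CONCAT_WS skips NULLs, so questions without answers or tags still work.
INSERT INTO question_search (questionid, searchtext)
SELECT q.id,
       CONCAT_WS(' ',
           q.questiontext,
           q.uniquecode,
           (SELECT GROUP_CONCAT(a.answertext SEPARATOR ' ')
              FROM answer_mapping am
              JOIN answers a ON a.id = am.answerid
             WHERE am.questionid = q.id),
           (SELECT GROUP_CONCAT(t.tag SEPARATOR ' ')
              FROM tagmapping tm
              JOIN tags t ON t.id = tm.tagid
             WHERE tm.questionid = q.id))
FROM questions q;

-- The search then touches one FULLTEXT index and needs no GROUP BY.
SELECT q.id, q.uniquecode, q.questiontext, q.totalvotes,
       u.login AS username,
       MATCH(s.searchtext)
           AGAINST ('rock guitarist chick*' IN BOOLEAN MODE) AS relevance
FROM question_search s
JOIN questions q ON q.id = s.questionid
LEFT JOIN users u ON u.id = q.userid
WHERE q.spam < 10
  AND MATCH(s.searchtext)
          AGAINST ('rock guitarist chick*' IN BOOLEAN MODE)
ORDER BY relevance DESC;
```

Two caveats with this sketch: the extra table has to be rebuilt or updated whenever questions, answers or tags change, and relevance is now computed over the combined text, so the ranking may differ from the original query, which scored only the question text and unique code.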
But I think this is the wrong approach. MySQL isn’t designed for that. In fact, the FULLTEXT feature is limited to MyISAM (InnoDB only gained FULLTEXT indexes in MySQL 5.6).
You should consider using Lucene/Solr with the DisMax request handler. It should give you good results in about 50 to 100 ms with an index of a few hundred thousand documents. At some point you can shard it, so the number of records is practically unlimited. You also get better options and can achieve better results: for example, fuzzy matching, giving more weight to newer documents, making tags more relevant than the title, post-query analysis, faceting, and so on.