Why does this simple query take forever?


0

Why does this MySQL query take forever (and never finishes) on a table that has 17k rows?

SELECT * FROM files_folders WHERE file IN (SELECT file FROM files_folders WHERE folder = 123);

Basically, a file can be in several folders (a physical file and its copies). I'm trying to get every row for the files that are in folder 123, including the rows for their copies in other folders. In my example, folder 123 currently holds 2 files, ID #4222 and ID #7121, but those 2 files could also be in other folders.

Am I doing this the wrong way or is there something I'm missing?

Edit: Here's an example of the table structure.

+------+-------+
| file | folder|
+------+-------+
| 1    | 1     |
| 2    | 1     |
| 1    | 2     |
| 3    | 2     |
| 4    | 3     |
+------+-------+

So I want to select all files (and their copies) that are in folder 1, which would return:

+------+-------+
| file | folder|
+------+-------+
| 1    | 1     |
| 2    | 1     |
| 1    | 2     |
+------+-------+

Because file 1 is in both folder 1 and folder 2.

Thank you.

2012-04-03 19:37
by ademers
do you have an index on Folder - Daniel A. White 2012-04-03 19:40
Am I missing something? Why not just: SELECT * FROM files_folders WHERE Folder = 123; - Alex Howansky 2012-04-03 19:43
Why not just SELECT * FROM files_folders WHERE Folder = 123? Currently you're selecting the ID where Folder = 123 and then essentially selecting * where ID = ID - David 2012-04-03 19:43
A file can be in multiple folders. So, basically I want to delete every copy of the file including the ones that are stored in Folder because a file can be in a folder and copies of said files could be elsewhere - ademers 2012-04-03 19:47
because mysql sucks - piyush 2013-07-07 11:24


1

Use a self join:

SELECT 
  ff.* 
FROM 
  files_folders AS ff
  INNER JOIN files_folders AS f ON f.file = ff.file
WHERE
  f.folder = 123
;
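
To check the self join against the sample table from the question, here is a minimal sketch. It assumes the file and folder columns shown in the question, and it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, so it only confirms which rows come back, not how fast MySQL runs it.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files_folders (file INTEGER, folder INTEGER)")
conn.executemany(
    "INSERT INTO files_folders (file, folder) VALUES (?, ?)",
    [(1, 1), (2, 1), (1, 2), (3, 2), (4, 3)],  # sample rows from the question
)

# f picks the files that are in the requested folder;
# ff returns every row (every copy) of those files.
SELF_JOIN = """
SELECT ff.file, ff.folder
FROM files_folders AS ff
INNER JOIN files_folders AS f ON f.file = ff.file
WHERE f.folder = ?
ORDER BY ff.folder, ff.file
"""

rows = conn.execute(SELF_JOIN, (1,)).fetchall()
print(rows)
```

For folder 1 this returns the three rows the question asks for. If the same file could appear twice in the same folder, the join would repeat rows, so SELECT DISTINCT may be needed in that case.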
2012-04-03 20:02
by Eugen Rieck
That did the trick. Thanks - ademers 2012-04-03 20:04
The real query I was trying to do is a DELETE. The SELECT works fine, but as soon as I incorporate it into a DELETE statement, it takes forever again. I've replaced the SELECT ff.* in your query with DELETE ff. Any thoughts on why it takes a long time? Thanks - ademers 2012-04-03 21:08
With a DELETE operation, the self-join is invalidated on every row deleted, which negates the performance win. For the DELETE, your best bet is to select the file IDs in one query, then run a DELETE query on the resulting ID list. This way the argument to IN (...) is constant, resulting in a fast delete - Eugen Rieck 2012-04-03 22:07
I don't think I can pass a list of IDs to the MySQL IN function. I can only do a subquery. http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html#function_i - ademers 2012-04-04 13:38
Of course you can - I do it all the time! This is the easiest version: First query: SELECT CAST(GROUP_CONCAT(DISTINCT file) AS CHAR) FROM files_folders WHERE folder = 123 - this gives back a string like 1,2,17. Then run "DELETE FROM files_folders WHERE file IN (0" + string_from_last_query + ")" - Eugen Rieck 2012-04-04 13:47
Well, that seems to be done with a scripting language like PHP. I was trying to accomplish this in straight SQL - ademers 2012-04-04 14:43
I am sure you are not typing the files directly into SQL, but are using some other language for your application. You can also use a stored procedure, with a cursor running over the inner query and repeating single instances of the outer query - this will be much faster than the original - Eugen Rieck 2012-04-04 16:52
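
The two-step delete discussed in these comments could look roughly like the sketch below. It is an illustration, not the commenter's actual code: the helper name delete_files_in_folder is hypothetical, and it uses Python's built-in sqlite3 module as a stand-in for a MySQL driver. It binds the IDs as parameters instead of concatenating them into the SQL string.

```python
import sqlite3

def delete_files_in_folder(conn, folder):
    """Hypothetical helper: delete every copy of each file found in `folder`."""
    # Step 1: fetch the file IDs once, so the IN list is a constant.
    files = [row[0] for row in conn.execute(
        "SELECT DISTINCT file FROM files_folders WHERE folder = ?", (folder,))]
    if not files:
        return 0  # nothing to delete; avoids an empty IN ()
    # Step 2: one placeholder per ID, with the values bound as parameters.
    placeholders = ",".join("?" * len(files))
    cur = conn.execute(
        f"DELETE FROM files_folders WHERE file IN ({placeholders})", files)
    conn.commit()
    return cur.rowcount

# Sample data from the question, in an in-memory SQLite database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files_folders (file INTEGER, folder INTEGER)")
conn.executemany(
    "INSERT INTO files_folders (file, folder) VALUES (?, ?)",
    [(1, 1), (2, 1), (1, 2), (3, 2), (4, 3)],
)

deleted = delete_files_in_folder(conn, 1)
remaining = conn.execute(
    "SELECT file, folder FROM files_folders ORDER BY file, folder").fetchall()
print(deleted, remaining)
```

Most MySQL drivers for Python use %s rather than ? as the placeholder, and SQLite caps the number of bound parameters per statement, so a very long ID list may need to be deleted in batches.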


2

For each row, MySQL needs to check whether its file is in the results returned by the subquery. That takes O(N).

It needs to be done for N rows.

So the complexity of your query is O(N^2). 17k^2 ≈ 2.9*10^8, so it should take around a minute, maybe less.
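
To make the estimate concrete, here is a rough Python model of that cost. It is only an illustration of the counting argument and does not reproduce how MySQL actually executes the query; both function names are hypothetical.

```python
def naive_in_scan(rows, folder):
    """Rescan all rows for every outer row, as in the O(N^2) estimate."""
    comparisons = 0
    matches = []
    for file, fld in rows:                      # outer query: N rows
        found = False
        for inner_file, inner_folder in rows:   # subquery rescanned: N rows
            comparisons += 1
            if inner_folder == folder and inner_file == file:
                found = True
        if found:
            matches.append((file, fld))
    return matches, comparisons

def hashed_in_scan(rows, folder):
    """Evaluate the subquery once into a set, then probe it per outer row."""
    steps = 0
    wanted = set()
    for file, fld in rows:                      # one pass to build the set
        steps += 1
        if fld == folder:
            wanted.add(file)
    matches = []
    for file, fld in rows:                      # one pass to probe it
        steps += 1
        if file in wanted:
            matches.append((file, fld))
    return matches, steps

sample = [(1, 1), (2, 1), (1, 2), (3, 2), (4, 3)]
naive_matches, naive_cost = naive_in_scan(sample, 1)
hashed_matches, hashed_cost = hashed_in_scan(sample, 1)
print(naive_cost, hashed_cost)   # N*N versus 2*N for N = 5
print(17_000 ** 2)               # comparisons in the naive model for 17k rows
```

Both functions return the same rows; only the amount of work differs, which is why evaluating the inner result once (as a join or a constant list does) helps so much.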

Why isn't your query simply

SELECT file FROM files_folders WHERE folder = 123

?

2012-04-03 19:45
by Jarosław Gomułka
Because a file can be in another folder also. I'll update my post with an example of the table structure - ademers 2012-04-03 19:49


-1

Why are you using a subquery? I don't think it's needed at all. You can select directly from the table:

SELECT * FROM files_folders WHERE Folder = 123

And a second thing:

"Because a file can be in another folder also"

Why does that require a subquery?

2012-04-04 06:14
by Ankit Sharma
Fix your grammar next time you answer a question please. It's hard to understand what you are trying to say if you have bad grammar - ragingasiancoder 2016-07-01 13:32