My Multithreaded Merge Sort Algorithm takes 10+ minutes to s
Moderator: Coders of Rage
- davidthefat
- Chaos Rift Maniac
- Posts: 529
- Joined: Mon Nov 10, 2008 3:51 pm
- Current Project: Fully Autonomous Robot
- Favorite Gaming Platforms: PS3
- Programming Language of Choice: C++
- Location: California
- Contact:
My Multithreaded Merge Sort Algorithm takes 10+ minutes to s
So this is for my AP Computer Science class in high school (equivalent to 2nd semester of Comp Sci 101) that focuses on data structures. So we have this little contest on who can make the fastest sorting algorithm. This current algorithm only takes 450 ms for sorting 20,000 strings. I do not know how that is compared to others, but seems like I'm on the right track. It handles 100,000 strings fine; less than 7 seconds (all these numbers include reading and writing the files) However, it just is a slug when it comes to 1 million strings. I tried to minimize the copying of data and just kept it to accessing using pointers (From what I've been told, all objects in java are pointers) Is this a nature of merge sort in itself or is it a flaw in my methodology?
Keep in mind, my system is a quad core cpu so I purposely made it only have 4 active running threads per file being sorted at any given time, but I've heard that the rule of thumb is 1.5 * the # of cores. The overhead from initiating the threads are negligible IMHO. It takes about 1/4 the time a single threaded merge sort does on my PC. And I am very aware that I should be commenting my file... Just a habit of never commenting for years.
I purposely did not join the sort threads until I started all of them by design. I know that it would be sluggish around 5+ files, but I will be testing one file at a time.
My sort method: [1] http://pastebin.com/YwHU4BPz
My "main" file: [2] http://pastebin.com/9kj4fpXL
Keep in mind, my system is a quad core cpu so I purposely made it only have 4 active running threads per file being sorted at any given time, but I've heard that the rule of thumb is 1.5 * the # of cores. The overhead from initiating the threads are negligible IMHO. It takes about 1/4 the time a single threaded merge sort does on my PC. And I am very aware that I should be commenting my file... Just a habit of never commenting for years.
I purposely did not join the sort threads until I started all of them by design. I know that it would be sluggish around 5+ files, but I will be testing one file at a time.
My sort method: [1] http://pastebin.com/YwHU4BPz
My "main" file: [2] http://pastebin.com/9kj4fpXL
- short
- ES Beta Backer
- Posts: 548
- Joined: Thu Apr 30, 2009 2:22 am
- Current Project: c++, c
- Favorite Gaming Platforms: SNES, PS2, SNES, SNES, PC NES
- Programming Language of Choice: c, c++
- Location: Oregon, US
Re: My Multithreaded Merge Sort Algorithm takes 10+ minutes
ok your going for speed ..... are you forced to write it in java...?
(you know somebody was going to ask, might as well be me)
(you know somebody was going to ask, might as well be me)
My github repository contains the project I am currently working on,
link: https://github.com/bjadamson
link: https://github.com/bjadamson
-
- Chaos Rift Cool Newbie
- Posts: 85
- Joined: Thu Jun 23, 2011 11:12 am
Re: My Multithreaded Merge Sort Algorithm takes 10+ minutes
A little overhead really adds up if your making threads for everything you sort
- davidthefat
- Chaos Rift Maniac
- Posts: 529
- Joined: Mon Nov 10, 2008 3:51 pm
- Current Project: Fully Autonomous Robot
- Favorite Gaming Platforms: PS3
- Programming Language of Choice: C++
- Location: California
- Contact:
Re: My Multithreaded Merge Sort Algorithm takes 10+ minutes
Yes, I am forced to write in java because the class is taught in java.short wrote:ok your going for speed ..... are you forced to write it in java...?
(you know somebody was going to ask, might as well be me)
- LeonBlade
- Chaos Rift Demigod
- Posts: 1314
- Joined: Thu Jan 22, 2009 12:22 am
- Current Project: Trying to make my first engine in C++ using OGL
- Favorite Gaming Platforms: PS3
- Programming Language of Choice: C++
- Location: Blossvale, NY
Re: My Multithreaded Merge Sort Algorithm takes 10+ minutes
short wrote:ok your going for speed ..... are you forced to write it in java...?
(you know somebody was going to ask, might as well be me)
There's no place like ~/
- GroundUpEngine
- Chaos Rift Devotee
- Posts: 835
- Joined: Sun Nov 08, 2009 2:01 pm
- Current Project: mixture
- Favorite Gaming Platforms: PC
- Programming Language of Choice: C++
- Location: UK
Re: My Multithreaded Merge Sort Algorithm takes 10+ minutes
The joys of life loldavidthefat wrote:Yes, I am forced to write in java because the class is taught in java.short wrote:ok your going for speed ..... are you forced to write it in java...?
(you know somebody was going to ask, might as well be me)
- Ginto8
- ES Beta Backer
- Posts: 1064
- Joined: Tue Jan 06, 2009 4:12 pm
- Programming Language of Choice: C/C++, Java
Re: My Multithreaded Merge Sort Algorithm takes 10+ minutes
Since the performance is fine up until 1 million strings, I'm guessing you're running into memory problems, and the computer is starting to use (uber-slow) swap space. That, or the function stack is getting a little too deep and is slowing things down. I think it's a space complexity issue though.
Edit: taking a closer look at your implementation, I notice 2 things: first, that your method, run on a large data set, would create a HUGE amount of threads. And second, that these threads are... well, useless. If all you're doing is starting the threads then waiting for them to end, why not just have them be non-threaded methods? I think if you removed the threading you could actually make the algorithm significantly faster.
Edit: taking a closer look at your implementation, I notice 2 things: first, that your method, run on a large data set, would create a HUGE amount of threads. And second, that these threads are... well, useless. If all you're doing is starting the threads then waiting for them to end, why not just have them be non-threaded methods? I think if you removed the threading you could actually make the algorithm significantly faster.
Quit procrastinating and make something awesome.
Ducky wrote:Give a man some wood, he'll be warm for the night. Put him on fire and he'll be warm for the rest of his life.
- Light-Dark
- Dreamcast Developer
- Posts: 307
- Joined: Sun Mar 13, 2011 7:57 pm
- Current Project: 2D RPG & NES Platformer
- Favorite Gaming Platforms: NES,SNES,N64,Genesis,Dreamcast,PC,Xbox360
- Programming Language of Choice: C/++
- Location: Canada
Re: My Multithreaded Merge Sort Algorithm takes 10+ minutes
uh-oh Java, well Java doesn't do well performance wise i at least have hope that you are also learning C/++ so you may be forgiven for your transgressions?
<tpw_rules> LightDark: java is a consequence of inverse moore's law: every 18 months, the average program will be twice as slow. therefore, computers always run at the same percevied speed. java's invention was a monumental step
- davidthefat
- Chaos Rift Maniac
- Posts: 529
- Joined: Mon Nov 10, 2008 3:51 pm
- Current Project: Fully Autonomous Robot
- Favorite Gaming Platforms: PS3
- Programming Language of Choice: C++
- Location: California
- Contact:
Re: My Multithreaded Merge Sort Algorithm takes 10+ minutes
If you have not noticed, it actually does not spawn more than 6 threads other than the main and read file threads (and other overhead threads java has, if it has any)Ginto8 wrote:Since the performance is fine up until 1 million strings, I'm guessing you're running into memory problems, and the computer is starting to use (uber-slow) swap space. That, or the function stack is getting a little too deep and is slowing things down. I think it's a space complexity issue though.
Edit: taking a closer look at your implementation, I notice 2 things: first, that your method, run on a large data set, would create a HUGE amount of threads. And second, that these threads are... well, useless. If all you're doing is starting the threads then waiting for them to end, why not just have them be non-threaded methods? I think if you removed the threading you could actually make the algorithm significantly faster.
Code: Select all
@Override
public void run()
{
mergesort(0, num - 1, 2); //This makes it only spawn 6 threads; 2 threads that calls two each.
}
Code: Select all
public static void mergesort(int l, int h, int i)
{
if (l < h)
{
int m = (l + h) / 2;
if(i > 0) //Checks if that "i" in the argument is bigger than 0. It recursively decrements it two lines below. The else statement does not generate any threads.
{
merger t1 = new merger(l, m, i - 1);
merger t2 = new merger(m + 1, h, i - 1);
t1.start();
t2.start();
try
{
t1.join();
t2.join();
}
catch(InterruptedException e)
{
System.out.println(e.getMessage());
}
}
else
{
mergesort(l, m, 0);
mergesort(m + 1, h, 0);
}
Do not worry; I know C++. In fact, the teacher almost did not let me take the class because I used C++ all year in his class two years ago... Just a fun fact. Why I am taking the class again is because he is teaching data structures and algorithms to people who already too the class.Light-Dark wrote:uh-oh Java, well Java doesn't do well performance wise i at least have hope that you are also learning C/++ so you may be forgiven for your transgressions?
edit: I'm a freaking idiot... I knew constructing the array of strings every iteration will **** me up... So I fixed it; it runs like a charm. 1 million strings now take ~790 ms... Fixed Code: http://pastebin.com/Jqf4TVEk