November 2013 – My Programming Notes

Monthly Archives: November 2013

C++ || Knapsack Problem Using Dynamic Programming

November 03, 2013 admin No comments

The program demonstrated on this page extends the previously implemented Exhaustive Search algorithm, this time using Dynamic Programming to find a solution to this problem.

Debian Package Overview	Select Show
Debian is a well-established GNU/Linux distribution. Ubuntu Linux, arguably the most popular Linux distribution, is based on Debian. Debian software is organized into packages. Each package corresponds to a software component, such as the GCC compiler or Firefox web browser. There are packages not only for the software that comprises the operating system, but also end-user applications. Nearly every piece of noteworthy open source software has a corresponding package. There are currently over 29,000 Debian source code packages available. This wide selection of packages is mostly a good thing. However it poses a problem for creating installation media – the floppy disks, CDs, DVDs, or flash drive images that administrators use to install the operating system. If a user wants to install a package that is missing from their installation media they need to download it over the internet, so it is beneficial to include as many packages as possible. It is impractical to include every package on those media, so the Debian project needs to select a small subset of important packages to include. Another part of Debian is the Popularity Contest, or popcon: The Popularity Contest tries to measure the popularity of each Debian package. Users may elect to participate, in which case the list of their installed packages is transmitted back to Debian. These lists are tallied as votes. So, thanks to popcon, we can get a vote tally for each package.

Debian Package Overview

Select Show

==== 1. THE PROBLEM ====

The Exhaustive Search algorithm ran in O(2n) time and was very slow.

The goal of this program is to implement a dynamic programming algorithm which runs in O(n2 • W) time, returning the optimal set of items/packages contained in the input file, which is fast enough to produce solutions to inputs of realistic size.

This program solves the following problem:

Input: A list of Debian packages, each with a size (in kilobytes) and number of votes, both of which are integers; and an integer W representing the capacity of the install media (in kilobytes).

Output: The set of packages with total size ≤ W , such that the total number of package votes is maximized.

In terms of the knapsack problem, the Debian package votes correspond to knapsack values, and our Debian package sizes correspond to knapsack weights.

==== 2. THE INPUT ====

The following file (knapsack_packages.txt) contains the name, size, and votes for approximately 36,000 binary Debian packages that are currently available on the amd64 platform, and have popcon information.

The file obeys the following format:

[ number of packages ] [ name 0 ] [ space ] [ votes 0 ] [ space ] [ size 0 ] [ newline ] [ name 1 ] [ space ] [ votes 1 ] [ space ] [ size 1 ] [ newline ] [ name 2 ] [ space ] [ votes 2 ] [ space ] [ size 2 ] [ newline ] ...

The packages are arranged within the file in sorted order, from the largest number of votes to the smallest.

==== 3. FLOW OF CONTROL ====

A test harness program is created which executes the above function and measures the elapsed time of the code corresponding to the algorithm in question. The test program will perform the following steps:

1. Load the "knapsack_packages.txt" file into memory.
2. Input size "n" and weight "W" as variables. 3. Execute the dynamic knapsack algorithm. The output of this algorithm should be the best combination sequence of packages.
4. Print the first 20 packages from the solution. A solution is the best sequence of packages, displaying the name of the package, the votes, and the size of each package, as well as the total votes and total size of the entire solution.

A knapsack problem instances is created of varying input sizes “n” by using the first “n” entries in the file knapsack_packages.txt. In other words, to create a problem instance with n = 100, only use the first 100 packages listed in the file as input.

==== 4. TEST HARNESS ====

Note: This program uses two external header files (Timer.h and Project2.h).
• Code for the Timer class (Timer.h) can be found here.
• Code for “Project2.h” can be found here.
• “Project5.h” is listed below.





		
		
			Dynamic Knapsack Test Harness
			
C++
			
			// =============================================================================
//    Author: K Perkins
//    Date:   Nov 3, 2013
//    Taken From: http://programmingnotes.org/
//    File:  main.cpp
//    Description: This is the test harness program which runs the code and 
//       measures the elapsed time of the code corresponding to the algorithm 
//       in question. 
//
//       This test program will reduce the problem of selecting which Debian
//       Linux packages to include on installation media, using the knapsack
//       problem. The goal is to implement a dynamic programming algorithm
//       for the knapsack problem, returning an optimal set of items/packages
//       which is fast enough to produce solutions to inputs of realistic size.
// =============================================================================
#include <iostream>
#include <cstdlib>
#include <cassert>
#include <fstream>
#include <vector>
#include "Timer.h"
#include "Project2.h"
#include "Project5.h"
using namespace std;

// the maximum weight
const int MAX_WEIGHT = 65536;

// the number of packages to examine
const int NUM_PACKAGES = 36149;

int main()
{
	// declare variables
	Timer timer;
	Project5 proj5;
	Project2 proj2;
	PackageStats access;
	ifstream infile;
	vector<PackageStats> packages;
	vector<PackageStats> bestCombo;	
	vector<int> knapResult;
	int* packageWeight = new int[NUM_PACKAGES];
	int* packageVotes = new int[NUM_PACKAGES];
	int totPackages = 0;

	// open file & make sure it exists
	infile.open("knapsack_packages.txt");
	if(infile.fail())
	{
		cout<<"nCant find file!n";
		exit(1);
	}

	// get the total number of packages from the file
	infile >> totPackages;	
	
	// make sure there are enough packages in the file
	assert(NUM_PACKAGES <= totPackages);
    
	// get the remaining info from the file
	//                      std::string        int             int    
	for(int x=0;(infile >> access.name >> access.votes >> access.size)
		&& x < NUM_PACKAGES; ++x)
	{
		packages.push_back(access);
		packageWeight[x] = access.size;
		packageVotes[x] = access.votes;
	}
	infile.close();
	
	// display stats
	cerr<<"n = "<<NUM_PACKAGES<<", W = "<<MAX_WEIGHT
		<<"nn-- Dynamic Search Solution --n";

	// start the timer
	timer.Start();
	
	// return a vector containing the array indexes
	// of the best knapsack package solution
	knapResult = proj5.DynamicKnapsack(packageWeight, packageVotes, NUM_PACKAGES, MAX_WEIGHT);
	
	// stop the timer
	timer.Stop();
	
	// using the data found from above, return
	// the packages that reside in those array indexes
	bestCombo = proj5.ReturnBest(packages, knapResult);
	
	// display info to the screen
	cout<<"n- Number of packages generated in this set: "<<bestCombo.size()
		<<"nn- First 20 packages..nn";		
	
	// display the best solution packages
	proj2.Display(bestCombo, 20);	
	
	// display the size and total votes
	cout<<"nTotal Size = "<<proj2.TotalSize(bestCombo)<<" -- Total Votes = "
		<<proj2.TotalVotes(bestCombo)<<endl;

	// display the elapsed time
	cout<<endl<<"It took "<<timer.Elapsed()*1000
		<<" clicks ("<<timer.Elapsed()<<" seconds)"<<endl;

	delete[] packageWeight;
	delete[] packageVotes;
	return 0;
}// http://programmingnotes.org/
			
				
					
				
					1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
				
						// =============================================================================
//    Author: K Perkins
//    Date:   Nov 3, 2013
//    Taken From: http://programmingnotes.org/
//    File:  main.cpp
//    Description: This is the test harness program which runs the code and 
//       measures the elapsed time of the code corresponding to the algorithm 
//       in question. 
//
//       This test program will reduce the problem of selecting which Debian
//       Linux packages to include on installation media, using the knapsack
//       problem. The goal is to implement a dynamic programming algorithm
//       for the knapsack problem, returning an optimal set of items/packages
//       which is fast enough to produce solutions to inputs of realistic size.
// =============================================================================
#include <iostream>
#include <cstdlib>
#include <cassert>
#include <fstream>
#include <vector>
#include "Timer.h"
#include "Project2.h"
#include "Project5.h"
using namespace std;
 
// the maximum weight
const int MAX_WEIGHT = 65536;
 
// the number of packages to examine
const int NUM_PACKAGES = 36149;
 
int main()
{
	// declare variables
	Timer timer;
	Project5 proj5;
	Project2 proj2;
	PackageStats access;
	ifstream infile;
	vector<PackageStats> packages;
	vector<PackageStats> bestCombo;	
	vector<int> knapResult;
	int* packageWeight = new int[NUM_PACKAGES];
	int* packageVotes = new int[NUM_PACKAGES];
	int totPackages = 0;
 
	// open file & make sure it exists
	infile.open("knapsack_packages.txt");
	if(infile.fail())
	{
		cout<<"nCant find file!n";
		exit(1);
	}
 
	// get the total number of packages from the file
	infile >> totPackages;	
	
	// make sure there are enough packages in the file
	assert(NUM_PACKAGES <= totPackages);
    
	// get the remaining info from the file
	//                      std::string        int             int    
	for(int x=0;(infile >> access.name >> access.votes >> access.size)
		&& x < NUM_PACKAGES; ++x)
	{
		packages.push_back(access);
		packageWeight[x] = access.size;
		packageVotes[x] = access.votes;
	}
	infile.close();
	
	// display stats
	cerr<<"n = "<<NUM_PACKAGES<<", W = "<<MAX_WEIGHT
		<<"nn-- Dynamic Search Solution --n";
 
	// start the timer
	timer.Start();
	
	// return a vector containing the array indexes
	// of the best knapsack package solution
	knapResult = proj5.DynamicKnapsack(packageWeight, packageVotes, NUM_PACKAGES, MAX_WEIGHT);
	
	// stop the timer
	timer.Stop();
	
	// using the data found from above, return
	// the packages that reside in those array indexes
	bestCombo = proj5.ReturnBest(packages, knapResult);
	
	// display info to the screen
	cout<<"n- Number of packages generated in this set: "<<bestCombo.size()
		<<"nn- First 20 packages..nn";		
	
	// display the best solution packages
	proj2.Display(bestCombo, 20);	
	
	// display the size and total votes
	cout<<"nTotal Size = "<<proj2.TotalSize(bestCombo)<<" -- Total Votes = "
		<<proj2.TotalVotes(bestCombo)<<endl;
 
	// display the elapsed time
	cout<<endl<<"It took "<<timer.Elapsed()*1000
		<<" clicks ("<<timer.Elapsed()<<" seconds)"<<endl;
 
	delete[] packageWeight;
	delete[] packageVotes;
	return 0;
}// http://programmingnotes.org/

==== 5. THE ALGORITHMS – “include Project5.h” ====





		
		
			Knapsack Problem Using Dynamic Programming
			
C++
			
			// =============================================================================
//    Author: K Perkins
//    Date:   Nov 3, 2013
//    Taken From: http://programmingnotes.org/
//    File:  Project5.h
//    Description: This is a simple class which holds the functions for 
//       project 5
// =============================================================================
#ifndef PROJECT5_H
#define PROJECT5_H

#include <vector>
#include <algorithm>
#include "Project2.h"
using std::vector;

struct PackageSet
{
	int index;
	int weight;
	vector<int> appearances;
};

class Project5
{
public:
	Project5(){}

	vector<int> DynamicKnapsack(int packageWeight[], int packageVotes[],
		int numPackages, int maxWeight)
	{
		// declare variables
		vector<int> best;
		vector<int> temp;
		vector<PackageSet> pset;
		int** T = new int*[numPackages+1];
		int deleted = 0;
		int currWeight = maxWeight;
		bool dontCheck = false;		

		for(int x=0; x <= numPackages; ++x)
		{
			T[x] = new int[maxWeight+1];
			PackageSet access;
			dontCheck = false;
			temp.clear();
			
			for(int y=0; y <= maxWeight; ++y)
			{
				if(x==0 || y==0)
				{
					T[x][y] = 0;
				}
				else if(packageWeight[x-1] <= y)
				{
					T[x][y] = std::max(T[x-1][y-packageWeight[x-1]]
							+ packageVotes[x-1], T[x-1][y]);
					if(T[x][y] == (T[x-1][y-packageWeight[x-1]] 
							+ packageVotes[x-1]))
					{	
						// if we find a valid packet, place it into 
						// the temp vector for storage
						if(!dontCheck &&
							!(std::find(temp.begin(), temp.end(), 
							packageWeight[x-1]) != temp.end()))
						{
							temp.push_back(packageWeight[x-1]);							
						}		
						// find all of the weight instances where
						// this packet is "valid"
						access.appearances.push_back(y);
						dontCheck = true;
					}
				}
				else
				{
					T[x][y] = T[x-1][y]; 
				}
			}// end for loop

			// gather info about the packet we just found
			if((std::find(temp.begin(), temp.end(), 
					packageWeight[x-1]) != temp.end()))
			{			
				access.index = x-1;
				access.weight = packageWeight[x-1];
				pset.push_back(access);
			}
			
			// memory management, used to delete
			// array indexes thats no longer in use
			if(x > 1)
			{
				delete[] T[deleted++];
			}
		}// end for loop		

		delete[] T;
		
		// obtain the best possible knapsack solutuion, and save
		// their array indexes into a vector, starting from the end
		// NOTE: this places the knapsack solution in opposite (reverse) order
		for(int x = pset.size()-1; x >= 0; --x)
		{
			if(IsSolution(pset, x, currWeight))
			{
				best.push_back(pset.at(x).index);
				currWeight -= pset.at(x).weight;
			}
			pset.pop_back();
		}
		
		// reverse the vector back into ascending order
		std::reverse(best.begin(), best.end());

		return best;
	}// end of DynamicKnapsack

	bool IsSolution(vector<PackageSet>& pset, int index, int currWeight)
	{
		return std::find(pset.at(index).appearances.begin(), 
			pset.at(index).appearances.end(), currWeight) != 
			pset.at(index).appearances.end();
	}// end of IsSolution

	vector<PackageStats> ReturnBest(vector<PackageStats>& packages,
		vector<int>& knapResult)
	{
		vector<PackageStats> best;
		for(unsigned x=0; x < knapResult.size(); ++x)
		{
			best.push_back(packages.at(knapResult.at(x)));
		}
		return best;
	}// end of ReturnBest

	~Project5(){}
};
#endif // http://programmingnotes.org/
			
				
					
				
					1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
				
						// =============================================================================
//    Author: K Perkins
//    Date:   Nov 3, 2013
//    Taken From: http://programmingnotes.org/
//    File:  Project5.h
//    Description: This is a simple class which holds the functions for 
//       project 5
// =============================================================================
#ifndef PROJECT5_H
#define PROJECT5_H
 
#include <vector>
#include <algorithm>
#include "Project2.h"
using std::vector;
 
struct PackageSet
{
	int index;
	int weight;
	vector<int> appearances;
};
 
class Project5
{
public:
	Project5(){}
 
	vector<int> DynamicKnapsack(int packageWeight[], int packageVotes[],
		int numPackages, int maxWeight)
	{
		// declare variables
		vector<int> best;
		vector<int> temp;
		vector<PackageSet> pset;
		int** T = new int*[numPackages+1];
		int deleted = 0;
		int currWeight = maxWeight;
		bool dontCheck = false;		
 
		for(int x=0; x <= numPackages; ++x)
		{
			T[x] = new int[maxWeight+1];
			PackageSet access;
			dontCheck = false;
			temp.clear();
			
			for(int y=0; y <= maxWeight; ++y)
			{
				if(x==0 || y==0)
				{
					T[x][y] = 0;
				}
				else if(packageWeight[x-1] <= y)
				{
					T[x][y] = std::max(T[x-1][y-packageWeight[x-1]]
							+ packageVotes[x-1], T[x-1][y]);
					if(T[x][y] == (T[x-1][y-packageWeight[x-1]] 
							+ packageVotes[x-1]))
					{	
						// if we find a valid packet, place it into 
						// the temp vector for storage
						if(!dontCheck &&
							!(std::find(temp.begin(), temp.end(), 
							packageWeight[x-1]) != temp.end()))
						{
							temp.push_back(packageWeight[x-1]);							
						}		
						// find all of the weight instances where
						// this packet is "valid"
						access.appearances.push_back(y);
						dontCheck = true;
					}
				}
				else
				{
					T[x][y] = T[x-1][y]; 
				}
			}// end for loop
 
			// gather info about the packet we just found
			if((std::find(temp.begin(), temp.end(), 
					packageWeight[x-1]) != temp.end()))
			{			
				access.index = x-1;
				access.weight = packageWeight[x-1];
				pset.push_back(access);
			}
			
			// memory management, used to delete
			// array indexes thats no longer in use
			if(x > 1)
			{
				delete[] T[deleted++];
			}
		}// end for loop		
 
		delete[] T;
		
		// obtain the best possible knapsack solutuion, and save
		// their array indexes into a vector, starting from the end
		// NOTE: this places the knapsack solution in opposite (reverse) order
		for(int x = pset.size()-1; x >= 0; --x)
		{
			if(IsSolution(pset, x, currWeight))
			{
				best.push_back(pset.at(x).index);
				currWeight -= pset.at(x).weight;
			}
			pset.pop_back();
		}
		
		// reverse the vector back into ascending order
		std::reverse(best.begin(), best.end());
 
		return best;
	}// end of DynamicKnapsack
 
	bool IsSolution(vector<PackageSet>& pset, int index, int currWeight)
	{
		return std::find(pset.at(index).appearances.begin(), 
			pset.at(index).appearances.end(), currWeight) != 
			pset.at(index).appearances.end();
	}// end of IsSolution
 
	vector<PackageStats> ReturnBest(vector<PackageStats>& packages,
		vector<int>& knapResult)
	{
		vector<PackageStats> best;
		for(unsigned x=0; x < knapResult.size(); ++x)
		{
			best.push_back(packages.at(knapResult.at(x)));
		}
		return best;
	}// end of ReturnBest
 
	~Project5(){}
};
#endif // http://programmingnotes.org/

QUICK NOTES:
The highlighted lines are sections of interest to look out for.

The code is heavily commented, so no further insight is necessary. If you have any questions, feel free to leave a comment below.

Note: Dont forget to include the input file!

The following is sample output:

=== RUN #1 ===
n = 36149, W = 65536 -- Dynamic Search Solution -- - Number of packages generated in this set: 764 - First 20 packages.. debianutils 90 128329 libgcc1 46 128327 debconf 168 121503 grep 595 121426 gzip 142 121346 sed 261 121220 findutils 805 121173 lsb-base 26 121121 sysv-rc 80 121092 sysvinit-utils 116 121078 base-files 65 121072 initscripts 93 121037 util-linux 846 121022 mount 272 120961 libselinux1 109 120918 cron 120 120872 base-passwd 49 120856 tzdata 488 120813 logrotate 64 120745 popularity-contest 67 120729 Total Size = 65536 -- Total Votes = 25250749 It took 33500 clicks (33.50 seconds) === RUN #2 === n = 36149, W = 800000 -- Dynamic Search Solution -- - Number of packages generated in this set: 4406 - First 20 packages.. debianutils 90 128329 libgcc1 46 128327 dpkg 2672 128294 perl-base 1969 127369 debconf 168 121503 grep 595 121426 gzip 142 121346 login 980 121332 coreutils 6505 121240 bash 1673 121229 sed 261 121220 findutils 805 121173 lsb-base 26 121121 sysv-rc 80 121092 sysvinit-utils 116 121078 base-files 65 121072 initscripts 93 121037 util-linux 846 121022 mount 272 120961 libselinux1 109 120918 Total Size = 800000 -- Total Votes = 41588693
It took 608760 clicks (608.76 seconds)

C++ || Decrease By Half Sorting Using Bubble Sort, Quick Sort, & Optimized Bubble Sort

November 02, 2013 admin No comments

The following is another homework assignment which was presented in an Algorithm Engineering class. Using a custom timer class, the following is a program which tries to improve upon the sorting code demonstrated in the initial Empirical Analysis.

The following program will execute two approaches: (1) implementing an algorithm with better asymptotic performance, and (2) tuning an existing algorithm.

==== 1. THE OBJECTIVE ====

The purpose of implementing this program is to obtain empirical results that answer the following questions:

• Are O(n log n) expected-time sorting algorithms, such as merge sort and quick sort, significantly faster than O(n2)-time algorithms in practice? • If so, by what margin? Is implementing a faster algorithm worth the effort? • Is it possible to get a O(n2)-time algorithm to beat a O(nlogn)-time algorithm by paying attention to implementation details? • If so, how much faster? Do you get better bang-for-the-buck by switching to an asymptotically-faster algorithm, or optimizing the same algorithm?

==== 2. THE ALGORITHMS ====

This program involves implementing and analyzing three algorithms:

1. Baseline: The O(n2) sorting algorithm implemented in Project 1. 2. Decrease-by-half: An O(n log n) algorithm (Quick Sort). 3. Optimized: A tuned, optimized version of the O(n2) baseline algorithm.

==== 3. FLOW OF CONTROL ====

A test harness program is created which executes the above functions and measures the elapsed time of the code corresponding to the algorithm in question. The test program will perform the following steps:

1. Input the value of n. Your code should treat n as a variable. 2. Create an array or vector of n random integers to serve as a problem instance. 3. Use a clock function to get the current time t1 . 4. Execute one algorithm (Bubble Sort, Quick sort, or Optimized Bubble Sort), using the array of random integers as input. 5. Use a clock function to get the current time t2 . 6. Output the elapsed time, t2 − t1 .

The test harness is configured in such a way to run all of the three algorithms, using a switch statement to change between the algorithms.

==== 4. TEST HARNESS ====

Note: This program uses two external header files (Timer.h and Project1.h).
• Code for the Timer class (Timer.h) can be found here.
• Code for “Project1.h” can be found here.
• “Project3.h” is listed below.





		
		
			Decrease By Half Test Harness
			
C++
			
			// =============================================================================
//    Author: K Perkins
//    Date:   Nov 2, 2013
//    Taken From: http://programmingnotes.org/
//    File:  main.cpp
//    Description: This is the test harness program which runs the code and 
//       measures the elapsed time of the code corresponding to the algorithm 
//       in question. This program will try to improve the sorting methods
//       written in Project 1 by two approaches: 
//         (1) Implementing an algorithm with better asymptotic performance
//         (2) Tuning an existing algorithm. 
//       The sorting methods being reviewed are Bubble sort, Quick sort, and an 
//       optimized version of Bubble sort.
// =============================================================================
#include <iostream>
#include <cstdlib>
#include <ctime>
#include "Timer.h"
#include "Project1.h"
#include "Project3.h"
using namespace std;

// typedefs for the functions to use
enum Algorithms {bubble, quick, bubbleOptim};

// number of algs being tested
int NUM_ALGS = 3;

// default arry size
const int ARRAY_SIZE = 150000;

int main()
{
	// declare variables
	srand(time(NULL));
	int* arry = new int[ARRAY_SIZE];
	int seed = rand(); // generate a random seed to use for array on all algorithms
	Timer timer;
	Project1 proj1;
	Project3 proj3;
	
	// display the array size	
	cerr<<"nArray Size = "<<ARRAY_SIZE<<endl;	

	// loop to automatically execute the proj1 being tested
	for(int x=0; x < NUM_ALGS; ++x)
	{
	    cout<<"n----- STARTING ALGORITHM #"<<x+1<<" ----- nn";
	    	
		timer.Reset();
		
		// place data in the array
		proj1.Generate(arry, ARRAY_SIZE, seed);
		
		// display data in array if its within range		
		if(ARRAY_SIZE <= 25)
		{
			proj1.Display(arry, ARRAY_SIZE);
			cout<<endl;
		}

		// start the timer
		timer.Start();
		
		// determine which alg to execute
		switch(x)
		{
			case bubble:
				proj1.BubbleSort(arry, ARRAY_SIZE);
				break;
			case quick:
				proj3.QuickSort(arry, ARRAY_SIZE);
				break;
			case bubbleOptim:
				proj3.BubbleSort(arry, ARRAY_SIZE);
				break;
			default:
				cout<<"nThat option doesnt exist...n";
				exit(1);
				break;
		}
		
		// stop the timer
		timer.Stop();
		
		// display data in array if its within range
		if(ARRAY_SIZE <= 25)
		{
			cout<<endl;
			proj1.Display(arry, ARRAY_SIZE);
			cout<<endl;
		}
		
		// display time
		cout<<endl<<"It took "<<timer.Elapsed()*1000
			<<" clicks ("<<timer.Elapsed()<<" seconds)"<<endl;
		cout<<"n----- ALGORITHM #"<<x+1<<" DONE! ----- nn";
	}

	delete[] arry;
	return 0;
}// http://programmingnotes.org/
			
				
					
				
					1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
				
						// =============================================================================
//    Author: K Perkins
//    Date:   Nov 2, 2013
//    Taken From: http://programmingnotes.org/
//    File:  main.cpp
//    Description: This is the test harness program which runs the code and 
//       measures the elapsed time of the code corresponding to the algorithm 
//       in question. This program will try to improve the sorting methods
//       written in Project 1 by two approaches: 
//         (1) Implementing an algorithm with better asymptotic performance
//         (2) Tuning an existing algorithm. 
//       The sorting methods being reviewed are Bubble sort, Quick sort, and an 
//       optimized version of Bubble sort.
// =============================================================================
#include <iostream>
#include <cstdlib>
#include <ctime>
#include "Timer.h"
#include "Project1.h"
#include "Project3.h"
using namespace std;
 
// typedefs for the functions to use
enum Algorithms {bubble, quick, bubbleOptim};
 
// number of algs being tested
int NUM_ALGS = 3;
 
// default arry size
const int ARRAY_SIZE = 150000;
 
int main()
{
	// declare variables
	srand(time(NULL));
	int* arry = new int[ARRAY_SIZE];
	int seed = rand(); // generate a random seed to use for array on all algorithms
	Timer timer;
	Project1 proj1;
	Project3 proj3;
	
	// display the array size	
	cerr<<"nArray Size = "<<ARRAY_SIZE<<endl;	
 
	// loop to automatically execute the proj1 being tested
	for(int x=0; x < NUM_ALGS; ++x)
	{
	    cout<<"n----- STARTING ALGORITHM #"<<x+1<<" ----- nn";
	    	
		timer.Reset();
		
		// place data in the array
		proj1.Generate(arry, ARRAY_SIZE, seed);
		
		// display data in array if its within range		
		if(ARRAY_SIZE <= 25)
		{
			proj1.Display(arry, ARRAY_SIZE);
			cout<<endl;
		}
 
		// start the timer
		timer.Start();
		
		// determine which alg to execute
		switch(x)
		{
			case bubble:
				proj1.BubbleSort(arry, ARRAY_SIZE);
				break;
			case quick:
				proj3.QuickSort(arry, ARRAY_SIZE);
				break;
			case bubbleOptim:
				proj3.BubbleSort(arry, ARRAY_SIZE);
				break;
			default:
				cout<<"nThat option doesnt exist...n";
				exit(1);
				break;
		}
		
		// stop the timer
		timer.Stop();
		
		// display data in array if its within range
		if(ARRAY_SIZE <= 25)
		{
			cout<<endl;
			proj1.Display(arry, ARRAY_SIZE);
			cout<<endl;
		}
		
		// display time
		cout<<endl<<"It took "<<timer.Elapsed()*1000
			<<" clicks ("<<timer.Elapsed()<<" seconds)"<<endl;
		cout<<"n----- ALGORITHM #"<<x+1<<" DONE! ----- nn";
	}
 
	delete[] arry;
	return 0;
}// http://programmingnotes.org/

==== 5. THE ALGORITHMS – “include Project3.h” ====





		
		
			Bubble Sort, Quick Sort, & Optimized Bubble Sort
			
C++
			
			// =============================================================================
//    Author: K Perkins
//    Date:   Nov 2, 2013
//    Taken From: http://programmingnotes.org/
//    File:  Project3.h
//    Description: This is a simple class which holds the functions for 
//       project 3
// =============================================================================
#ifndef PROJECT3_H
#define PROJECT3_H

#include <cstdlib>
#include <algorithm>

class Project3
{
public:
	Project3(){}
	// exchange unsorted elements with the last element
	// not the adjacent element
	void BubbleSort(int arry[], int size)
	{
		int last = size-1;
		do{
			for(int current=0; current < last; ++current) 
			{
				if(arry[current] > arry[last]) 
				{
					std::swap(arry[current], arry[last]);
				}
			}
			--last;
		}while(last >= 0);
	}// end of BubbleSort

	void QuickSort(int arry[], int size) 
	{
		if(size > 1) 
		{
			// choose pivot
			int pivotIndex = rand()%(size-1); 
			// partition the arry and get the new pivot position
			int newPiviotIndex = Partition(arry, size, pivotIndex); 
			// quick sort the first part
			QuickSort(arry, newPiviotIndex); 
			// quick sort the second part
			QuickSort(arry+newPiviotIndex+1, size-newPiviotIndex-1); 
		}
	}// end of QuickSort

	int Partition(int arry[], int size, int pivotIndex) 
	{
		int pivotValue = arry[pivotIndex];
		arry[pivotIndex] = arry[size-1]; // swap pivot with last element
		arry[size-1] = pivotValue;
		int left = 0; // left index
		int right = size-2; // right index
	
		while(left < right) 
		{ // ( < pivot ), pivot, ( >= pivot)
			while((arry[left] < pivotValue) && (left < right))
			{
				++left;
			}
			while((arry[right] >= pivotValue) && (left < right))
			{
				--right;
			}
			if(left < right) 
			{
				std::swap(arry[left], arry[right]);
				++left;
				--right;
			}
		}
		if(left == right)
		{
			if(arry[left] < pivotValue)
			{
				++left;
			}
		}
		arry[size-1] = arry[left]; // move pivot to its final place
		arry[left] = pivotValue;
		return left; // return the position of the pivot
	}// end of Partition
	
	~Project3(){}
};
#endif // http://programmingnotes.org/
			
				
					
				
					1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
				
						// =============================================================================
//    Author: K Perkins
//    Date:   Nov 2, 2013
//    Taken From: http://programmingnotes.org/
//    File:  Project3.h
//    Description: This is a simple class which holds the functions for 
//       project 3
// =============================================================================
#ifndef PROJECT3_H
#define PROJECT3_H
 
#include <cstdlib>
#include <algorithm>
 
class Project3
{
public:
	Project3(){}
	// exchange unsorted elements with the last element
	// not the adjacent element
	void BubbleSort(int arry[], int size)
	{
		int last = size-1;
		do{
			for(int current=0; current < last; ++current) 
			{
				if(arry[current] > arry[last]) 
				{
					std::swap(arry[current], arry[last]);
				}
			}
			--last;
		}while(last >= 0);
	}// end of BubbleSort
 
	void QuickSort(int arry[], int size) 
	{
		if(size > 1) 
		{
			// choose pivot
			int pivotIndex = rand()%(size-1); 
			// partition the arry and get the new pivot position
			int newPiviotIndex = Partition(arry, size, pivotIndex); 
			// quick sort the first part
			QuickSort(arry, newPiviotIndex); 
			// quick sort the second part
			QuickSort(arry+newPiviotIndex+1, size-newPiviotIndex-1); 
		}
	}// end of QuickSort
 
	int Partition(int arry[], int size, int pivotIndex) 
	{
		int pivotValue = arry[pivotIndex];
		arry[pivotIndex] = arry[size-1]; // swap pivot with last element
		arry[size-1] = pivotValue;
		int left = 0; // left index
		int right = size-2; // right index
	
		while(left < right) 
		{ // ( < pivot ), pivot, ( >= pivot)
			while((arry[left] < pivotValue) && (left < right))
			{
				++left;
			}
			while((arry[right] >= pivotValue) && (left < right))
			{
				--right;
			}
			if(left < right) 
			{
				std::swap(arry[left], arry[right]);
				++left;
				--right;
			}
		}
		if(left == right)
		{
			if(arry[left] < pivotValue)
			{
				++left;
			}
		}
		arry[size-1] = arry[left]; // move pivot to its final place
		arry[left] = pivotValue;
		return left; // return the position of the pivot
	}// end of Partition
	
	~Project3(){}
};
#endif // http://programmingnotes.org/

QUICK NOTES:
The highlighted lines are sections of interest to look out for.

The code is heavily commented, so no further insight is necessary. If you have any questions, feel free to leave a comment below.

Note: This page presents sample code for the above problem, but scatter plots will not be provided.

The following is sample output:

Array Size = 150000
----- STARTING ALGORITHM #1 ----- It took 248290 clicks (248.29 seconds) ----- ALGORITHM #1 DONE! ----- ----- STARTING ALGORITHM #2 ----- It took 50 clicks (0.05 seconds) ----- ALGORITHM #2 DONE! ----- ----- STARTING ALGORITHM #3 ----- It took 164300 clicks (164.3 seconds)
----- ALGORITHM #3 DONE! -----

C++ || Knapsack Problem Using Exhaustive Search

November 02, 2013 admin No comments

The following is another homework assignment which was presented in an Algorithm Engineering class. Using a custom timer class, the following is a program which reduces the problem of selecting which Debian Linux packages to include on installation media, to the classical knapsack problem. This program implements an exhaustive search algorithm for this problem, and displays its performance running time to the screen.

NOTE: Looking for the Dynamic Programming version to this problem? Click here!

==== 1. OVERVIEW ====

Debian is a well-established GNU/Linux distribution. Ubuntu Linux, arguably the most popular Linux distribution, is based on Debian. Debian software is organized into packages. Each package corresponds to a software component, such as the GCC compiler or Firefox web browser. There are packages not only for the software that comprises the operating system, but also end-user applications. Nearly every piece of noteworthy open source software has a corresponding package. There are currently over 29,000 Debian source code packages available.

This wide selection of packages is mostly a good thing. However it poses a problem for creating installation media – the floppy disks, CDs, DVDs, or flash drive images that administrators use to install the operating system. If a user wants to install a package that is missing from their installation media they need to download it over the internet, so it is beneficial to include as many packages as possible. It is impractical to include every package on those media, so the Debian project needs to select a small subset of important packages to include.

Another part of Debian is the Popularity Contest, or popcon:

The Popularity Contest tries to measure the popularity of each Debian package. Users may elect to participate, in which case the list of their installed packages is transmitted back to Debian. These lists are tallied as votes. So, thanks to popcon, we can get a vote tally for each package.

==== 2. THE PROBLEM ====

This program solves the following problem:

Output: The set of packages with total size ≤ W , such that the total number of package votes is maximized.

In terms of the knapsack problem, the Debian package votes correspond to knapsack values, and our Debian package sizes correspond to knapsack weights.

==== 3. THE INPUT ====

The file obeys the following format:

[ number of packages ] [ name 0 ] [ space ] [ votes 0 ] [ space ] [ size 0 ] [ newline ] [ name 1 ] [ space ] [ votes 1 ] [ space ] [ size 1 ] [ newline ] [ name 2 ] [ space ] [ votes 2 ] [ space ] [ size 2 ] [ newline ] ...

The packages are arranged within the file in sorted order, from the largest number of votes to the smallest.

==== 4. FLOW OF CONTROL ====

1. Load the "knapsack_packages.txt" file into memory. 2. Input size "n" and weight "W" as variables. 3. Execute the exhaustive search knapsack algorithm. The output of this algorithm should be the best combination sequence of packages. 4. Print the solution. A solution is the best sequence of packages, displaying the name of the package, the votes, and the size of each package, as well as the total votes and total size of the entire solution.

==== 5. TEST HARNESS ====

Note: This program uses a custom Timer class (Timer.h). To obtain code for that class, click here.

“Project2.h” is listed below.





		
		
			Exhaustive Knapsack Test Harness
			
C++
			
			// =============================================================================
//    Author: K Perkins
//    Date:   Nov 2, 2013
//    Taken From: http://programmingnotes.org/
//    File:  main.cpp
//    Description: This is the test harness program which runs the code and 
//       measures the elapsed time of the code corresponding to the algorithm 
//       in question. This test program will reduce the problem of selecting 
//       which Debian Linux packages to include on installation media, using the 
//       knapsack problem. An exhaustive search algorithm is used, and a measure 
//       of its performance is recorded.
// =============================================================================
#include <iostream>
#include <cstdlib>
#include <cassert>
#include <fstream>
#include <string>
#include <vector>
#include "Timer.h"
#include "Project2.h"
using namespace std;

// the maximum weight
const int MAX_WEIGHT = 65536;

// the number of packages to examine
const int NUM_PACKAGES = 24;

int main()
{
	// declare variables
	Timer timer;
	Project2 proj2;
	ifstream infile;
	PackageStats access;
	vector<PackageStats> packages;
	vector<PackageStats> bestCombo;	
	int totPackages = 0;

	// open file & make sure it exists
	infile.open("knapsack_packages.txt");
	if(infile.fail())
	{
		cerr<<"nCant find file!n";
		exit(1);
	}

	// get the total number of packages from the file
	infile >> totPackages;	
	
	// make sure there are enough packages in the file
	assert(NUM_PACKAGES <= totPackages);
	
	// display stats
	cerr<<"n = "<<NUM_PACKAGES<<", W = "<<MAX_WEIGHT
		<<"nn-- Exhaustive Search Solution --nn";	
	
	// get the remaining info from the file
	//                      std::string        int             int
	for(int x=0; (infile >> access.name >> access.votes >> access.size)
		&& x < NUM_PACKAGES; ++x)
	{
		packages.push_back(access);
	}
	infile.close();

	// start the timer
	timer.Start();
	
	// find the best knapsack subset solution
	bestCombo = proj2.ExhaustiveKnapsack(packages, MAX_WEIGHT);
	
	// stop the timer
	timer.Stop();

	// display the best solution packages
	proj2.Display(bestCombo, bestCombo.size());

	// display the size and total votes
	cout<<"nTotal Size = "<<proj2.TotalSize(bestCombo)<<" -- Total Votes = "
		<<proj2.TotalVotes(bestCombo)<<endl;

	// display the elapsed time
	cout<<"nIt took "<<timer.Elapsed()*1000
		<<" clicks ("<<timer.Elapsed()<<" seconds)"<<endl;

	return 0;
}// http://programmingnotes.org/
			
				
					
				
					1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
				
						// =============================================================================
//    Author: K Perkins
//    Date:   Nov 2, 2013
//    Taken From: http://programmingnotes.org/
//    File:  main.cpp
//    Description: This is the test harness program which runs the code and 
//       measures the elapsed time of the code corresponding to the algorithm 
//       in question. This test program will reduce the problem of selecting 
//       which Debian Linux packages to include on installation media, using the 
//       knapsack problem. An exhaustive search algorithm is used, and a measure 
//       of its performance is recorded.
// =============================================================================
#include <iostream>
#include <cstdlib>
#include <cassert>
#include <fstream>
#include <string>
#include <vector>
#include "Timer.h"
#include "Project2.h"
using namespace std;
 
// the maximum weight
const int MAX_WEIGHT = 65536;
 
// the number of packages to examine
const int NUM_PACKAGES = 24;
 
int main()
{
	// declare variables
	Timer timer;
	Project2 proj2;
	ifstream infile;
	PackageStats access;
	vector<PackageStats> packages;
	vector<PackageStats> bestCombo;	
	int totPackages = 0;
 
	// open file & make sure it exists
	infile.open("knapsack_packages.txt");
	if(infile.fail())
	{
		cerr<<"nCant find file!n";
		exit(1);
	}
 
	// get the total number of packages from the file
	infile >> totPackages;	
	
	// make sure there are enough packages in the file
	assert(NUM_PACKAGES <= totPackages);
	
	// display stats
	cerr<<"n = "<<NUM_PACKAGES<<", W = "<<MAX_WEIGHT
		<<"nn-- Exhaustive Search Solution --nn";	
	
	// get the remaining info from the file
	//                      std::string        int             int
	for(int x=0; (infile >> access.name >> access.votes >> access.size)
		&& x < NUM_PACKAGES; ++x)
	{
		packages.push_back(access);
	}
	infile.close();
 
	// start the timer
	timer.Start();
	
	// find the best knapsack subset solution
	bestCombo = proj2.ExhaustiveKnapsack(packages, MAX_WEIGHT);
	
	// stop the timer
	timer.Stop();
 
	// display the best solution packages
	proj2.Display(bestCombo, bestCombo.size());
 
	// display the size and total votes
	cout<<"nTotal Size = "<<proj2.TotalSize(bestCombo)<<" -- Total Votes = "
		<<proj2.TotalVotes(bestCombo)<<endl;
 
	// display the elapsed time
	cout<<"nIt took "<<timer.Elapsed()*1000
		<<" clicks ("<<timer.Elapsed()<<" seconds)"<<endl;
 
	return 0;
}// http://programmingnotes.org/

==== 6. THE ALGORITHMS – “include Project2.h” ====





		
		
			Knapsack Problem Using Exhaustive Search
			
C++
			
			// =============================================================================
//    Author: K Perkins
//    Date:   Nov 2, 2013
//    Taken From: http://programmingnotes.org/
//    File:  Project2.h
//    Description: This is a simple class which holds the functions for 
//       project 2
// =============================================================================
#ifndef PROJECT2_H
#define PROJECT2_H

#include <iostream>
#include <vector>
#include <string>
using std::vector;
using std::string;

// structure to hold the items contained in the file
struct PackageStats
{
	string name;
	int votes;
	int size;
};

class Project2
{
public:
	Project2(){}
	vector<PackageStats> ExhaustiveKnapsack(vector<PackageStats>& packages, int weight)
	{	
		vector<PackageStats> best;	
		int* subsets = new int[packages.size()];		
		int bestVote = 0;

		// generate subsets (2^n possibilities)
		for(unsigned x=0; x < (unsigned)(1 << (int)packages.size()); ++x)
		{	
			int index = 0;	
			int totalSize = 0;
			int totalVotes = 0;
			
			// generate subsets using binary digits
			for(unsigned y=0; y < packages.size(); ++y)
			{
				if(((x >> y) & 1) == 1)
				{
					subsets[index++] = y;
					totalSize += packages.at(y).size;
					totalVotes += packages.at(y).votes;
				}
			}
			// find the best combination of subsets
			if((totalSize <= weight) && (best.empty() || (totalVotes > bestVote)))
			{	
				bestVote = totalVotes;
				best = ReturnBest(packages, subsets, index);
			}			
		}
		delete[] subsets;
		return best;
	}// end of ExhaustiveKnapsack

	int TotalSize(vector<PackageStats>& packages, vector<int> s = vector<int>())
	{
		int total = 0;

		// if theres 2 parameters
		if(!s.empty())
		{
			for(unsigned x=0; x < s.size(); ++x)
			{
				total += packages.at(s.at(x)).size;
			}
		}
		else // if theres only 1
		{
			for(unsigned x=0; x < packages.size(); ++x)
			{
				total += packages.at(x).size;
			}
		}
		return total;
	}// end of TotalSize

	int TotalVotes(vector<PackageStats>& packages, vector<int> s = vector<int>())
	{
		int total = 0;

		// if theres 2 parameters
		if(!s.empty())
		{
			for(unsigned x=0; x < s.size(); ++x)
			{
				total += packages.at(s.at(x)).votes;
			}
		}
		else // if theres only 1
		{
			for(unsigned x=0; x < packages.size(); ++x)
			{
				total += packages.at(x).votes;
			}
		}
		return total;
	}// end of TotalVotes

	vector<PackageStats> ReturnBest(vector<PackageStats>& packages, int subsets[], int size)
	{
		vector<PackageStats> best;
		for(int x=0; x < size; ++x)
		{
			best.push_back(packages.at(subsets[x]));
		}
		return best;
	}// end of ReturnBest
	
	void Display(vector<PackageStats>& packages, unsigned size)
	{
		for(unsigned x=0; x < size && x < packages.size(); ++x)
		{
			std::cout<<packages.at(x).name<<" "
			     <<packages.at(x).size<<" "<<packages.at(x).votes<<std::endl;
		}
	}// end of Display	
	~Project2(){}
};
#endif // http://programmingnotes.org/
			
				
					
				
					1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
				
						// =============================================================================
//    Author: K Perkins
//    Date:   Nov 2, 2013
//    Taken From: http://programmingnotes.org/
//    File:  Project2.h
//    Description: This is a simple class which holds the functions for 
//       project 2
// =============================================================================
#ifndef PROJECT2_H
#define PROJECT2_H
 
#include <iostream>
#include <vector>
#include <string>
using std::vector;
using std::string;
 
// structure to hold the items contained in the file
struct PackageStats
{
	string name;
	int votes;
	int size;
};
 
class Project2
{
public:
	Project2(){}
	vector<PackageStats> ExhaustiveKnapsack(vector<PackageStats>& packages, int weight)
	{	
		vector<PackageStats> best;	
		int* subsets = new int[packages.size()];		
		int bestVote = 0;
 
		// generate subsets (2^n possibilities)
		for(unsigned x=0; x < (unsigned)(1 << (int)packages.size()); ++x)
		{	
			int index = 0;	
			int totalSize = 0;
			int totalVotes = 0;
			
			// generate subsets using binary digits
			for(unsigned y=0; y < packages.size(); ++y)
			{
				if(((x >> y) & 1) == 1)
				{
					subsets[index++] = y;
					totalSize += packages.at(y).size;
					totalVotes += packages.at(y).votes;
				}
			}
			// find the best combination of subsets
			if((totalSize <= weight) && (best.empty() || (totalVotes > bestVote)))
			{	
				bestVote = totalVotes;
				best = ReturnBest(packages, subsets, index);
			}			
		}
		delete[] subsets;
		return best;
	}// end of ExhaustiveKnapsack
 
	int TotalSize(vector<PackageStats>& packages, vector<int> s = vector<int>())
	{
		int total = 0;
 
		// if theres 2 parameters
		if(!s.empty())
		{
			for(unsigned x=0; x < s.size(); ++x)
			{
				total += packages.at(s.at(x)).size;
			}
		}
		else // if theres only 1
		{
			for(unsigned x=0; x < packages.size(); ++x)
			{
				total += packages.at(x).size;
			}
		}
		return total;
	}// end of TotalSize
 
	int TotalVotes(vector<PackageStats>& packages, vector<int> s = vector<int>())
	{
		int total = 0;
 
		// if theres 2 parameters
		if(!s.empty())
		{
			for(unsigned x=0; x < s.size(); ++x)
			{
				total += packages.at(s.at(x)).votes;
			}
		}
		else // if theres only 1
		{
			for(unsigned x=0; x < packages.size(); ++x)
			{
				total += packages.at(x).votes;
			}
		}
		return total;
	}// end of TotalVotes
 
	vector<PackageStats> ReturnBest(vector<PackageStats>& packages, int subsets[], int size)
	{
		vector<PackageStats> best;
		for(int x=0; x < size; ++x)
		{
			best.push_back(packages.at(subsets[x]));
		}
		return best;
	}// end of ReturnBest
	
	void Display(vector<PackageStats>& packages, unsigned size)
	{
		for(unsigned x=0; x < size && x < packages.size(); ++x)
		{
			std::cout<<packages.at(x).name<<" "
			     <<packages.at(x).size<<" "<<packages.at(x).votes<<std::endl;
		}
	}// end of Display	
	~Project2(){}
};
#endif // http://programmingnotes.org/

QUICK NOTES:
The highlighted lines are sections of interest to look out for.

NOTE: Looking for the Dynamic Programming version to this problem? Click here!

The code is heavily commented, so no further insight is necessary. If you have any questions, feel free to leave a comment below.

Note: Dont forget to include the input file!

The following is sample output:

n = 24, W = 65536
-- Exhaustive Search Solution -- debianutils 90 128329 libgcc1 46 128327 dpkg 2672 128294 perl-base 1969 127369 debconf 168 121503 grep 595 121426 gzip 142 121346 login 980 121332 coreutils 6505 121240 bash 1673 121229 sed 261 121220 findutils 805 121173 lsb-base 26 121121 sysv-rc 80 121092 sysvinit-utils 116 121078 base-files 65 121072 initscripts 93 121037 util-linux 846 121022 mount 272 120961 libselinux1 109 120918 cron 120 120872 base-passwd 49 120856 apt 1356 120826 tzdata 488 120813 Total Size = 19526 -- Total Votes = 2934456
It took 19140 clicks (19.14 seconds)

C++ || Empirical Analysis Using Min Element, Bubble Sort, & Selection Sort

November 01, 2013 admin No comments

The following is another homework assignment which was presented in an Algorithm Engineering class. Using a custom timer class, the following is a program which performs an empirical analysis of three non recursive algorithms. This program implements the algorithms and displays their performance running time to the screen.

The algorithms being examined are: MinElement, which finds the smallest element an array. Bubble Sort, and Selection Sort.

==== 1. ASYMPTOTIC ANALYSIS ====

Selection sort and Bubble sort both run in O(n2) time. MinElement runs in O(n) time. The empirical analysis implemented in this program should agree with the above asymptotic bounds, but sometimes experiments surprise us.

==== 2. EMPIRICAL ANALYSIS ====

To analyze the three algorithms empirically the elapsed running time (in seconds) should be measured for various values of array sizes “n.” These results should be graphed on a scatter plot, which will then help to infer which complexity class the plot corresponds to. The asymptotic analysis above says that we should expect these graphs to resemble linear or quadratic curves.

Timing code for empirical analysis takes some care. It is important to measure the elapsed time of only the code for the algorithm itself, and not other steps such as loading input files or printing output. Also, since computer code executes very rapidly, it is important to measure time in small fractions of seconds.

==== 3. WHAT TO MEASURE ====

The goal is to draw a scatter plot graph for each algorithm’s running times (a total of three plots). Each plot needs to have enough data points to interpolate a fitting curve; 5 is the smallest number that might be reasonable.

So each algorithm should be ran for at least 5 different values of size “n.” At least one very small value of n (less than 10) should be included, and one big value that’s large enough to make the code run for at least 5 minutes should be used. Once the data is graphed, the curve should resemble the appropriate asymptotic bounds for the function being examined.

Note: This page will present sample code for the above problem, but scatter plots will not be provided.

==== 4. FLOW OF CONTROL ====

1. Input the value of n. Your code should treat n as a variable. 2. Create an array or vector of n random integers to serve as a problem instance. 3. Use a clock function to get the current time t1 . 4. Execute one algorithm (MinElement, bubble sort, or insertion sort), using the array of random integers as input. 5. Use a clock function to get the current time t2 . 6. Output the elapsed time, t2 − t1 .

The test harness is configured in such a way to run all of the three algorithms, using a switch statement to change between the algorithms.

==== 5. TEST HARNESS ====

Note: This program uses a custom Timer class (Timer.h). To obtain code for that class, click here.

“Project1.h” is listed below.





		
		
			Empirical Analysis Test Harness
			
C++
			
			// =============================================================================
//    Author: K Perkins
//    Date:   Nov 1, 2013
//    Taken From: http://programmingnotes.org/
//    File: main.cpp
//    Description: This is the test harness program which runs the code and 
//       measures the elapsed time of the code corresponding to the algorithm 
//       in question. This test program performs the following operations: 
//         1. Input a value of n.
//         2. Create an array of n random integers to serve as a problem instance
//         3. Use a clock function to get the current time t1.
//         4. Execute one algorithm (MinElement, bubble sort, insertion sort), 
//             using the array of random integers as input.
//         5. Use a clock function to get the current time t2.
//         6. Output the elapsed time, t2 - t1.
// =============================================================================
#include <iostream>
#include <ctime>
#include <cstdlib>
#include "Timer.h"
#include "Project1.h"
using namespace std;

// typedefs for the functions to use
enum Algorithms {minElem, bubble, selection};

// the total number of algs to execute
const int NUM_ALGS = 3;

// default array size
const int ARRAY_SIZE = 20000;

int main()
{
	srand(time(NULL));
	int* arry = new int[ARRAY_SIZE];
	int min = 0;
	int seed = rand(); // generate a random seed to use for array on all algorithms		
	Timer timer;
	Project1 proj1;
	
	// display the array size	
	cerr<<"nArray Size = "<<ARRAY_SIZE<<endl;

	// loop to automatically execute the alg being tested
	for(int x=0; x < NUM_ALGS; ++x)
	{
	    cout<<"n----- STARTING ALGORITHM #"<<x+1<<" ----- nn";
	    	
		timer.Reset();
		
		// place data into the array
		proj1.Generate(arry, ARRAY_SIZE, seed);
		
		// display data in array if its within range
		if(ARRAY_SIZE <= 25)
		{
			proj1.Display(arry, ARRAY_SIZE);
			cout<<endl;
		}
		
		// start the timer
		timer.Start();
		
		// determine which alg to execute
		switch(x)
		{
    		case minElem:
    			min = proj1.MinElement(arry, ARRAY_SIZE);
    			cout<<"nMin = "<<min<<endl;
    			break;
    		case selection:
    			proj1.SelectionSort(arry, ARRAY_SIZE);
    			break;
    		case bubble:
    			proj1.BubbleSort(arry, ARRAY_SIZE);
    			break;
    		default:
    			cout<<"nThat option doesnt exist...n";
    			exit(1);
    			break;
		}

		// stop the timer
		timer.Stop();
		
		// display data in array if its within range		
		if(ARRAY_SIZE <= 25)
		{
			cout<<endl;
			proj1.Display(arry, ARRAY_SIZE);
			cout<<endl;
		}
		
		// display total elapsed time
		cout<<endl<<"It took "<<timer.Elapsed()*1000
			<<" clicks ("<<timer.Elapsed()<<" seconds)"<<endl;
		cout<<"n----- ALGORITHM #"<<x+1<<" DONE! ----- nn";
	}
	
	return 0;
}// http://programmingnotes.org/
			
				
					
				
					1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
				
						// =============================================================================
//    Author: K Perkins
//    Date:   Nov 1, 2013
//    Taken From: http://programmingnotes.org/
//    File: main.cpp
//    Description: This is the test harness program which runs the code and 
//       measures the elapsed time of the code corresponding to the algorithm 
//       in question. This test program performs the following operations: 
//         1. Input a value of n.
//         2. Create an array of n random integers to serve as a problem instance
//         3. Use a clock function to get the current time t1.
//         4. Execute one algorithm (MinElement, bubble sort, insertion sort), 
//             using the array of random integers as input.
//         5. Use a clock function to get the current time t2.
//         6. Output the elapsed time, t2 - t1.
// =============================================================================
#include <iostream>
#include <ctime>
#include <cstdlib>
#include "Timer.h"
#include "Project1.h"
using namespace std;
 
// typedefs for the functions to use
enum Algorithms {minElem, bubble, selection};
 
// the total number of algs to execute
const int NUM_ALGS = 3;
 
// default array size
const int ARRAY_SIZE = 20000;
 
int main()
{
	srand(time(NULL));
	int* arry = new int[ARRAY_SIZE];
	int min = 0;
	int seed = rand(); // generate a random seed to use for array on all algorithms		
	Timer timer;
	Project1 proj1;
	
	// display the array size	
	cerr<<"nArray Size = "<<ARRAY_SIZE<<endl;
 
	// loop to automatically execute the alg being tested
	for(int x=0; x < NUM_ALGS; ++x)
	{
	    cout<<"n----- STARTING ALGORITHM #"<<x+1<<" ----- nn";
	    	
		timer.Reset();
		
		// place data into the array
		proj1.Generate(arry, ARRAY_SIZE, seed);
		
		// display data in array if its within range
		if(ARRAY_SIZE <= 25)
		{
			proj1.Display(arry, ARRAY_SIZE);
			cout<<endl;
		}
		
		// start the timer
		timer.Start();
		
		// determine which alg to execute
		switch(x)
		{
    		case minElem:
    			min = proj1.MinElement(arry, ARRAY_SIZE);
    			cout<<"nMin = "<<min<<endl;
    			break;
    		case selection:
    			proj1.SelectionSort(arry, ARRAY_SIZE);
    			break;
    		case bubble:
    			proj1.BubbleSort(arry, ARRAY_SIZE);
    			break;
    		default:
    			cout<<"nThat option doesnt exist...n";
    			exit(1);
    			break;
		}
 
		// stop the timer
		timer.Stop();
		
		// display data in array if its within range		
		if(ARRAY_SIZE <= 25)
		{
			cout<<endl;
			proj1.Display(arry, ARRAY_SIZE);
			cout<<endl;
		}
		
		// display total elapsed time
		cout<<endl<<"It took "<<timer.Elapsed()*1000
			<<" clicks ("<<timer.Elapsed()<<" seconds)"<<endl;
		cout<<"n----- ALGORITHM #"<<x+1<<" DONE! ----- nn";
	}
	
	return 0;
}// http://programmingnotes.org/

==== 6. THE ALGORITHMS – “include Project1.h” ====





		
		
			MinElement, BubbleSort, SelectionSort
			
C++
			
			// =============================================================================
//    Author: K Perkins
//    Date:   Nov 1, 2013
//    Taken From: http://programmingnotes.org/
//    File:  Project1.h
//    Description: This is a simple class which holds the functions for 
//       project 1 
// =============================================================================
#ifndef PROJECT1_H
#define PROJECT1_H

#include <iostream>
#include <cassert>
#include <cstdlib>
#include <algorithm>

class Project1
{
public:
	Project1(){}
	int MinElement(int arry[], int size)
	{
		assert(size > 0);
		int min = arry[0];
		for(int x=1; x < size; ++x)
		{
			if(arry[x] < min)
			{
				min = arry[x];
			}
		}
		return min;
	}// end of MinElement

	void SelectionSort(int arry[], int size)
	{
		for(int x=0; x <= size-2; ++x)
		{
			int min = x;
			for(int y=x+1; y <= size-1; ++y)
			{
				if(arry[y] < arry[min])
				{
					min = y;
				}
			}
			std::swap(arry[x], arry[min]);
		}
	}// end of SelectionSort

	void BubbleSort(int arry[], int size)
	{
		for(int x=0; x <= size-2; ++x)
		{
			for(int y=0; y <= (size-2)-x; ++y)
			{
				if(arry[y+1] < arry[y])
				{
					std::swap(arry[y], arry[y+1]);
				}
			}
		}
	}// end of BubbleSort
	
	void Generate(int arry[], int size, int seed)
	{
		srand(seed);
		for(int x=0; x < size; ++x)
		{
			arry[x] = rand() % 7281987;
		}		
	}// end of Generate

	void Display(int arry[], int size)
	{
		for(int x=0; x < size; ++x)
		{
			std::cout<<arry[x]<<" ";
		}
	}// end of Display	
	~Project1(){}
};
#endif // http://programmingnotes.org/
			
				
					
				
					1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
				
						// =============================================================================
//    Author: K Perkins
//    Date:   Nov 1, 2013
//    Taken From: http://programmingnotes.org/
//    File:  Project1.h
//    Description: This is a simple class which holds the functions for 
//       project 1 
// =============================================================================
#ifndef PROJECT1_H
#define PROJECT1_H
 
#include <iostream>
#include <cassert>
#include <cstdlib>
#include <algorithm>
 
class Project1
{
public:
	Project1(){}
	int MinElement(int arry[], int size)
	{
		assert(size > 0);
		int min = arry[0];
		for(int x=1; x < size; ++x)
		{
			if(arry[x] < min)
			{
				min = arry[x];
			}
		}
		return min;
	}// end of MinElement
 
	void SelectionSort(int arry[], int size)
	{
		for(int x=0; x <= size-2; ++x)
		{
			int min = x;
			for(int y=x+1; y <= size-1; ++y)
			{
				if(arry[y] < arry[min])
				{
					min = y;
				}
			}
			std::swap(arry[x], arry[min]);
		}
	}// end of SelectionSort
 
	void BubbleSort(int arry[], int size)
	{
		for(int x=0; x <= size-2; ++x)
		{
			for(int y=0; y <= (size-2)-x; ++y)
			{
				if(arry[y+1] < arry[y])
				{
					std::swap(arry[y], arry[y+1]);
				}
			}
		}
	}// end of BubbleSort
	
	void Generate(int arry[], int size, int seed)
	{
		srand(seed);
		for(int x=0; x < size; ++x)
		{
			arry[x] = rand() % 7281987;
		}		
	}// end of Generate
 
	void Display(int arry[], int size)
	{
		for(int x=0; x < size; ++x)
		{
			std::cout<<arry[x]<<" ";
		}
	}// end of Display	
	~Project1(){}
};
#endif // http://programmingnotes.org/

QUICK NOTES:
The highlighted lines are sections of interest to look out for.

The code is heavily commented, so no further insight is necessary. If you have any questions, feel free to leave a comment below.

Note: This page presents sample code for the above problem, but scatter plots will not be provided.

The following is sample output:

Array Size = 20000
----- STARTING ALGORITHM #1 ----- Min = 2 It took 0 clicks (0 seconds) ----- ALGORITHM #1 DONE! ----- ----- STARTING ALGORITHM #2 ----- It took 4350 clicks (4.35 seconds) ----- ALGORITHM #2 DONE! ----- ----- STARTING ALGORITHM #3 ----- It took 2150 clicks (2.15 seconds)
----- ALGORITHM #3 DONE! -----

Monthly Archives: November 2013

C++ || Knapsack Problem Using Dynamic Programming

C++ || Decrease By Half Sorting Using Bubble Sort, Quick Sort, & Optimized Bubble Sort

C++ || Knapsack Problem Using Exhaustive Search

C++ || Empirical Analysis Using Min Element, Bubble Sort, & Selection Sort

Search

Compile Your Code Now

Translate Page

Popular Pages

Recent

Archives

Categories