Project #2: Schelling’s Model of Housing Segregation (Cont.)

Due: Wednesday, August 7th, 11:59pm CDT

The goal of this project is to continue implementing additional features from your previous project along with performing timing tests.

Peanut Cluster and GPU Partitions

If you are not familiar with submitting jobs to a cluster (i.e., the Peanut cluster), please make sure to watch the following video and read over the SLURM documentation provided by our tech staff before beginning the assignment:

We will grade all assignments using the Peanut cluster’s GPUs, and all programming assignments must work correctly on one of these machines. If you have an NVIDIA GPU installed on your local machine, then you may develop locally; however, your program must still work on one of the cluster’s GPU nodes, no exceptions.

You have the option of batching your programs to the following cluster nodes:

  PARTITION  AVAIL  TIMELIMIT  NODES  STATE  NODELIST
  gpu-all    up     4:00:00    2      idle   gpu[2-3]
  titan      up     4:00:00    1      idle   gpu3
  pascal     up     4:00:00    1      idle   gpu2
  quadro     up     4:00:00    1      idle   gpu1

Please be cautious about using titan: it has known stalling problems, so we advise avoiding it until it has been reliably fixed.
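If you have not written a SLURM job script before, it might look like the sketch below. The resource values, output filename, and the use of `sample_city.txt` are assumptions; adjust them to your own runs:

```shell
# Write a minimal SLURM job script (the values here are illustrative).
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=schelling
#SBATCH --partition=gpu-all
#SBATCH --time=00:30:00
#SBATCH --gres=gpu:1
#SBATCH --output=schelling.%j.out

./schelling sample_city.txt
EOF

# Submit it from a cluster login node:
# sbatch job.sh
```

Swap `--partition=gpu-all` for `pascal` or `quadro` if you want a specific GPU node.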

Creating Your Private Repository

To actually get your private repository, you will need this invitation URL:

  • Project 2 invitation (Please check the Post “Project 2 is ready” Ed)

When you click on an invitation URL, you will have to complete the following steps:

  1. You will need to select your CNetID from a list. This will allow us to know which student is associated with each GitHub account. This step only needs to be done for the very first invitation you accept.

Note

If you are on the waiting list for this course you will not have a repository made for you until you are admitted into the course. I will post the starter code on Ed so you can work on the assignment until you are admitted into the course.

  2. You must click “Accept this assignment” or your repository will not actually be created.

  3. After accepting the assignment, GitHub will take a few minutes to create your repository. You should receive an email from GitHub when your repository is ready. Normally, it’s ready within seconds, and you can just refresh the page.

  4. You now need to clone your repository (i.e., download it to your machine).
    • Make sure you’ve set up SSH access on your GitHub account.

    • For each repository, you will need to get the SSH URL of the repository. To get this URL, log into GitHub and navigate to your project repository (take into account that you will have a different repository per project). Then, click on the green “Code” button, and make sure the “SSH” tab is selected. Your repository URL should look something like this: git@github.com:mpcs52072-sum24/proj2-GITHUB-USERNAME.git.

    • If you do not know how to use git clone to clone your repository, then follow this guide that GitHub provides: Cloning a Repository
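Putting the pieces together, the clone step might look like the sketch below, where GITHUB-USERNAME is a placeholder for your own account name:

```shell
# Build the SSH URL for this course's proj2 repository.
# GITHUB-USERNAME is a placeholder; substitute your own account name.
GHUSER=GITHUB-USERNAME
URL="git@github.com:mpcs52072-sum24/proj2-${GHUSER}.git"
echo "$URL"

# git clone "$URL"        # downloads the repository
# cd "proj2-${GHUSER}"    # enter your working copy
```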

If you run into any issues, or need us to make any manual adjustments to your registration, please let us know via Ed Discussion.

Project overview

The goal of this project is to determine whether your project 1 optimizations help with overall performance. Specifically, for this project you will do the following:

  1. Implement a sequential version of project #1.

  2. Perform timing tests on your implementation.

  3. Implement an additional feature that can potentially use streams with asynchronous memory copying, unified memory, and/or dynamic parallelism.

Task #0: Copying over your work

Notice that your repository for proj2 is empty. To get started, copy over your code from your previous proj1 repository and make sure to git add, git commit, and git push the code to your proj2 remote repository.
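The copy-over step can be sketched as follows. The directory names below are placeholders standing in for your actual proj1 and proj2 clones; the one pitfall worth noting is to avoid copying proj1’s .git metadata into proj2:

```shell
# Placeholder directories standing in for your real proj1/proj2 clones.
mkdir -p proj1-demo proj2-demo
echo '// project 1 code' > proj1-demo/schelling.cu

# Copy everything from proj1 into proj2, then drop any copied git metadata
# so proj2 keeps its own history.
cp -r proj1-demo/. proj2-demo/
rm -rf proj2-demo/.git

# In your real proj2 clone, you would then record and publish the files:
# git add . && git commit -m "Copy over project 1 code" && git push
```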

Task #1: Sequential implementation

Inside the proj2/schelling.cu file, implement a sequential version of the Schelling’s model you implemented in project 1. Add a -s flag to the schelling.cu program to configure the program to run sequentially. This flag must come before the file argument:

./schelling -s sample_city.txt  // Runs the sequential version on the configuration file sample_city.txt

By default (i.e., without the -s flag) the program runs the GPU version.

Task #2: Performance Testing & Report

For this project, we will only look at overall speedup. We will time the overall execution of the sequential and GPU versions. First, create the following configuration files:

  1. 8x8 city configuration

  2. 64x64 city configuration

  3. 512x512 city configuration

  4. 1024x1024 city configuration

  5. 4096x4096 city configuration

The configuration can be a random mixture of maroon homeowners, blue homeowners, and empty locations. You can also choose the additional required parameters at random, but ensure that the number of simulation steps is exactly 5. This might require you to experiment with the other parameters in the file to get this right.

You will produce one execution bar graph (please refer back to hw2 for a graph visual). The x-axis will represent the five configurations explained above. The y-axis represents the execution time for that specific configuration. For each configuration on the x-axis, the graph should show sequential execution time compared to GPU execution time. For the GPU version, experiment with grid and block sizes to find the combination that yields the best performance for that specific configuration file. Feel free to add optional flags to the schelling program that let you specify the grid and block sizes based on your implementation. For example:

./schelling -g 128 128 -b 32 32 sample_city.txt  // Runs the gpu version with grid (128,128) and block sizes (32,32)

You must generate the speedup graphs using an sbatch script named generate-graphs.sh. This means you cannot generate the graphs by hand. We should be able to run sbatch generate-graphs.sh, changing only the SBATCH configuration settings, and the graphs should be produced. You can generate the graphs using gnuplot, a Python script, Java, etc.

Here are a few additional requirements:

  1. You will observe that timings vary a little each time you run your program. Please run every experiment at least 10 times and use the average time from those runs.

  2. Make sure to title the graph and label each axis. Make sure to adjust your y-axis range so that we can accurately see the values. That is, if most of your values fall in the range [0,1], then don’t make your range [0,14].

  3. Name the PNG file of the graph execution_schelling.png.

  4. The script that generates the graphs must be a file called generate-graphs.sh.

  5. Write the generate-graphs.sh script so that it contains all commands (or calls another script that does) needed to reproduce your timings and your plot; i.e., the experiment should be fully automated by just calling the script as:

    sbatch generate-graphs.sh
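One possible shape for this script is sketched below. The configuration filenames, the use of GNU time for measurement, and the plot_graph.py plotting script are all assumptions; substitute whatever matches your own setup:

```shell
#!/bin/bash
#SBATCH --job-name=schelling-graphs
#SBATCH --partition=gpu-all
#SBATCH --time=02:00:00
#SBATCH --gres=gpu:1

# Average a whitespace-separated list of timings in seconds.
avg() { echo "$@" | awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i; printf "%.3f\n", s / NF }'; }

# Only run the experiments when the binary is present (i.e., on the cluster).
if [ -x ./schelling ]; then
  : > timings.dat
  for cfg in city_8.txt city_64.txt city_512.txt city_1024.txt city_4096.txt; do
    seq_t=""; gpu_t=""
    for run in 1 2 3 4 5 6 7 8 9 10; do
      # GNU time's %e format prints elapsed wall-clock seconds to stderr.
      seq_t="$seq_t $( { /usr/bin/time -f %e ./schelling -s "$cfg"; } 2>&1 >/dev/null )"
      gpu_t="$gpu_t $( { /usr/bin/time -f %e ./schelling "$cfg"; } 2>&1 >/dev/null )"
    done
    echo "$cfg $(avg $seq_t) $(avg $gpu_t)" >> timings.dat
  done
  # A separate plotting script (gnuplot, Python, etc.) turns timings.dat
  # into execution_schelling.png; plot_graph.py is a hypothetical name.
  python3 plot_graph.py timings.dat
fi
```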
    

README.md file

Inside the proj2/README.md file, provide an explanation of your results. Focus on answering the following:

  • Where are you getting speedups in your graphs and why?

  • In which areas are you not getting a speedup, and why? Specifically, which parts of your GPU version outperform the CPU version (and vice versa)?

One to two paragraphs is sufficient for answering these questions.

Task #3: Implement a new GPU Feature

The last task focuses on implementing a new GPU feature that uses streams with asynchronous memory copying, unified memory, and/or dynamic parallelism to help overall performance. Review your GPU implementation and do the following:

  1. Determine at least one part of the GPU version that could benefit from streams with asynchronous memory copying, unified memory, and/or dynamic parallelism. Update that specific part by implementing at least one of these features.

  2. Add a flag to the schelling program (of your choosing) that turns on this feature.

Answer the following questions in your README.md file: Does running this feature help with the overall performance of one of the above configuration files? Explain why or why not based on your implementation. Back up your claim with at least one execution timing of a configuration.
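When backing up the claim, reporting the speedup of the feature-enabled run over the baseline run is a clear way to present the timing. A small helper along these lines may help; the numbers below are purely illustrative placeholders, not expected results:

```shell
# speedup = baseline time / feature-enabled time (both in seconds)
speedup() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'; }

# Illustrative numbers only; use your own measured averages.
speedup 4.20 3.50   # → 1.20
```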

Grading

Programming assignments will be graded according to a general rubric. Specifically, we will assign points for completeness, correctness, design, and style. (For more details on the categories, see our Assignment Rubric page.)

The exact weights for each category will vary from one assignment to another. For this assignment, the weights will be:

  • Task 1: 20%

  • Task 2: 50%

  • Task 3: 30%

Submission

Before submitting, make sure you’ve added, committed, and pushed all your code to GitHub. You must submit your final work through Gradescope (linked from our Canvas site) in the “Project #2” assignment page in one of two ways:

  1. Uploading from GitHub directly (recommended): You can link your GitHub account to your Gradescope account and upload the correct repository for the assignment. When you submit your homework, a pop-up window will appear. Click “GitHub” and then “Connect to GitHub” to connect your GitHub account to Gradescope. Once connected (you only need to do this once), you can select the repository you wish to upload and the branch (which should always be “main” or “master”) for this course.

  2. Uploading via a Zip file: You can also upload a zip file of the homework directory. Please make sure you upload the entire directory and keep the initial structure the same as the starter code; otherwise, you run the risk of not passing the automated tests.

Note

For either option, you must upload the entire directory structure; otherwise, your automated tests will not run correctly and you will be penalized if we have to run the tests manually. Going with the first option does this automatically for you. You can always add additional directories and files (and even files/directories inside the starter directories), but the default directory/file structure must not change.

Depending on the assignment, once you submit your work, an “autograder” will run. This autograder should produce the same test results as when you run the code yourself; if it doesn’t, please let us know so we can look into it. A few other notes:

  • You are allowed to make as many submissions as you want before the deadline.

  • Please make sure you have read and understood our Late Submission Policy.

  • Your completeness score is determined solely based on the automated tests, but we may adjust your score if you attempt to pass tests by rote (e.g., by writing code that hard-codes the expected output for each possible test input).

  • Gradescope will report the test score it obtains when running your code. If there is a discrepancy between the score you get when running our grader script, and the score reported by Gradescope, please let us know so we can take a look at it.