A student homework/exam evaluation framework built on Python's unittest framework.
Unitgrade is an autograding framework which enables instructors to offer automatically evaluated programming assignments in a maximally convenient format for the students.
Unitgrade is built on Python's unittest framework; i.e., you can directly use your existing unittests without any changes. It will therefore integrate well with any modern IDE. What it offers beyond unittest is the ability to collect tests in reports (for automatic evaluation) and an easy and safe mechanism for verifying results.
unittest compatible

Online autograding services often say that they have adapted their particular model in order to make students better or happier. I did a small thought experiment and asked myself what I would ideally want out of an autograder if I were a student. I quickly realized the only thing I really cared about was how easily it allowed me to fix bugs in my homework assignments. In other words, I think students prioritize the same thing as we all do when we write software tests -- to quickly and easily identify and fix problems.
However, I would not use an online autograder for any of my own software projects for a number of reasons:
- Why alt+tab to an external tool when my IDE already has excellent test plugins?
- print-statements are often not readily available; I don't know of any service that shows them live

This raises the question: if I would not want to use an online autograder as a way to fix issues in my own software projects, why should students prefer it?
The alternative is in my view obvious -- simply give students a suite of unittests. This raises some potential issues such as safety and administrative convenience, but they turned out to be easy to solve. If you want to learn more about developing tests see the test-development repository here: https://gitlab.compute.dtu.dk/tuhe/unitgrade_private
Unitgrade requires Python 3.8 or higher and can be installed using pip:
pip install unitgrade
After the command completes, you should be all set. If you want to upgrade an old version of unitgrade, run:
pip install unitgrade --upgrade --no-cache-dir
If you are using Anaconda with a virtual environment, you can also install it as you would any other package:
source activate myenv
conda install git pip
pip install unitgrade
When you are done, you should be able to import unitgrade. Type python in the terminal and try:
>>> import unitgrade
Your homework assignments are called reports and are distributed as regular .py files. In the following I will use cs101report1.py as an example, and you can find a real-world example here: https://gitlab.compute.dtu.dk/tuhe/unitgrade_private/-/blob/master/examples/example_simplest/students/cs101/report1.py .
A report is simply a collection of questions, and each question may in turn involve several tests.
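To make the idea of a question containing several tests concrete, here is a minimal sketch written with plain unittest only. It does not use Unitgrade's own report classes; the function cluster_sizes, the class Question1 and the helper run_question are hypothetical names invented for illustration.

```python
import unittest


def cluster_sizes(labels):
    """Hypothetical student function: count how many items carry each label."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return counts


class Question1(unittest.TestCase):
    """One 'question' made up of several individual tests."""

    def test_two_clusters(self):
        self.assertEqual(cluster_sizes([1, 2, 1]), {1: 2, 2: 1})

    def test_empty_input(self):
        self.assertEqual(cluster_sizes([]), {})


def run_question(test_case):
    """Run every test in a TestCase class and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_question(Question1)
print(f"{result.testsRun} tests run, all passed: {result.wasSuccessful()}")
```

Because these are ordinary unittest tests, an IDE test runner can discover and debug them like any other test class.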
I recommend running the tests through your IDE. In PyCharm, this is as simple as right-clicking on the test and selecting Run as unittest:
The outcome of the tests is shown in the lower-left corner, and in this case they are all green, meaning they have passed. You can see the console output generated by a test by clicking on it. If a test fails, you can select debug as unittest from the menu above to launch a debugger, and you can right-click on individual tests to re-run them.
To check your score, you have to run the main script (cs101report1.py). This can be done either through PyCharm (hint: open the file and press alt-shift-F10) or in the console by running the command:
python cs101report1.py
The file will run and show an output where the score of each question is computed as a (weighted) average of the individual passed tests. An example is given below:
_ _ _ _ _____ _
| | | | (_) | | __ \ | |
| | | |_ __ _| |_| | \/_ __ __ _ __| | ___
| | | | '_ \| | __| | __| '__/ _` |/ _` |/ _ \
| |_| | | | | | |_| |_\ \ | | (_| | (_| | __/
\___/|_| |_|_|\__|\____/_| \__,_|\__,_|\___| v0.1.29.0, started: 16/09/2022 13:47:57
02531 week 5: Looping (use --help for options)
Question 1: Cluster analysis
* q1.1) clusterAnalysis([0.8, 0.0, 0.6]) = [1, 2, 1] ?.............................................................PASS
* q1.2) clusterAnalysis([0.5, 0.6, 0.3, 0.3]) = [2, 2, 1, 1] ?.....................................................PASS
* q1.3) clusterAnalysis([0.2, 0.7, 0.3, 0.5, 0.0]) = [1, 2, 1, 2, 1] ?.............................................PASS
* q1.4) Cluster analysis for tied lists............................................................................PASS
* q1) Total.................................................................................................... 10/10
Question 2: Remove incomplete IDs
* q2.1) removeId([1.3, 2.2, 2.3, 4.2, 5.1, 3.2,...]) = [2.2, 2.3, 5.1, 3.2, 5.3, 3.3,...] ?........................PASS
* q2.2) removeId([1.1, 1.2, 1.3, 2.1, 2.2, 2.3]) = [1.1, 1.2, 1.3, 2.1, 2.2, 2.3] ?................................PASS
* q2.3) removeId([5.1, 5.2, 4.1, 4.3, 4.2, 8.1,...]) = [4.1, 4.3, 4.2, 8.1, 8.2, 8.3] ?............................PASS
* q2.4) removeId([1.1, 1.3, 2.1, 2.2, 3.1, 3.3,...]) = [4.1, 4.2, 4.3] ?...........................................PASS
* q2.5) removeId([6.1, 3.2, 7.2, 4.2, 6.2, 9.1,...]) = [9.1, 5.2, 1.2, 5.1, 1.2, 9.2,...] ?........................PASS
* q2) Total.................................................................................................... 10/10
Question 3: Bacteria growth rates
* q3.1) bacteriaGrowth(100, 0.4, 1000, 500) = 7 ?..................................................................PASS
* q3.2) bacteriaGrowth(10, 0.4, 1000, 500) = 14 ?..................................................................PASS
* q3.3) bacteriaGrowth(100, 1.4, 1000, 500) = 3 ?..................................................................PASS
* q3.4) bacteriaGrowth(100, 0.0004, 1000, 500) = 5494 ?............................................................PASS
* q3.5) bacteriaGrowth(100, 0.4, 1000, 99) = 0 ?...................................................................PASS
* q3) Total.................................................................................................... 10/10
Question 4: Fermentation rate
* q4.1) fermentationRate([20.1, 19.3, 1.1, 18.2, 19.7, ...], 15, 25) = 19.600 ?....................................PASS
* q4.2) fermentationRate([20.1, 19.3, 1.1, 18.2, 19.7, ...], 1, 200) = 29.975 ?....................................PASS
* q4.3) fermentationRate([1.75], 1, 2) = 1.750 ?...................................................................PASS
* q4.4) fermentationRate([20.1, 19.3, 1.1, 18.2, 19.7, ...], 18.2, 20) = 19.500 ?..................................PASS
* q4) Total.................................................................................................... 10/10
Total points at 13:48:02 (0 minutes, 4 seconds)....................................................................40/40
Provisional evaluation
--------- -----
q1) Total 10/10
q2) Total 10/10
q3) Total 10/10
q4) Total 10/10
Total 40/40
--------- -----
Note your results have not yet been registered.
To register your results, please run the file:
>>> looping_tests_grade.py
In the same manner as you ran this file.
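The per-question score described above (a weighted average of the passed tests) could be computed roughly as follows. This is a sketch under stated assumptions, not Unitgrade's actual scoring code: the function name question_score, the per-test weights and the rounding-down rule are all hypothetical.

```python
def question_score(passed, weights, max_points):
    """Weighted share of passed tests, scaled to the question's points.

    passed:     list of booleans, one per test
    weights:    list of non-negative integer weights, one per test
    max_points: points the question is worth
    Rounding down to a whole number of points is an assumption.
    """
    if len(passed) != len(weights):
        raise ValueError("passed and weights must have the same length")
    total = sum(weights)
    if total == 0:
        return 0
    earned = sum(weight for ok, weight in zip(passed, weights) if ok)
    return max_points * earned // total


# Four equally weighted tests, as in Question 1 of the output above.
all_passed = question_score([True, True, True, True], [1, 1, 1, 1], 10)
one_failed = question_score([True, True, False, True], [1, 1, 1, 1], 10)
print(all_passed, one_failed)
```

With these assumptions, passing all four tests gives the full 10 points, while one failing test out of four gives 7.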
Once you are happy with your results and want to hand in, you should run the script with the _grade.py postfix, in this case cs101report1_grade.py (see console output above):
python cs101report1_grade.py
This script will run the same tests as before and generate a file named Report0_handin_18_of_18.token (this is called the token-file because of the extension). The token-file contains all your results, and it is the token-file you should upload (and no other). Because you cannot (and most definitely should not!) edit it, it shows the number of points in the file name.
I recommend watching and running the tests from your IDE, as this allows you to use the debugger in conjunction with your tests. However, I have put together a dashboard that allows you to see the outcome of individual tests and what is currently recorded in your token-file. To start the dashboard, simply run the command
unitgrade
from a directory that contains a test (the directory will be searched recursively for tests). The command will start a small background service and open this page:
Features supported in the current version:
.token-file

Note that the run feature currently assumes that your system-wide python command can run the tests. This may not be the case if you are using virtual environments -- I expect to fix this soon.
Why are there two scripts?
The reason we use two test scripts (one with the _grade.py postfix and one without) is that the tests should be easy to debug, but at the same time we have to avoid accidental changes to the test scripts. The tests themselves are the same, so if one script works, so should the other.
My non-grade script and the _grade.py script give different numbers of points
Since the two scripts should contain the same code, the reason is with near certainty that you have made an (accidental) change to the test scripts. Please ensure both scripts are up-to-date and if the problem persists, get support.
Why is there a unitgrade-directory with a bunch of pickle files? Should I also upload them?
No. The files contain the pre-computed test results your code is compared against. You should only upload the .token file, nothing else.
I am worried you might think I cheated because I opened the '_grade.py' script/token file
This should not be a concern. Both files are in a binary format (i.e., if you open them in a text editor they look like garbage), which means that if you make an accidental change, they will in all probability simply fail to work.
I think I might have edited the report1.py file. Is this a problem since one of the tests has now been altered?
Feel free to edit/break this file as much as you like if it helps you work out the correct solution. However, since the report1_grade.py script contains a separate version of the tests, please ensure both files are in sync to avoid unexpected behavior.
The course material should contain information about the intended function of the scripts, and the file report1.py should mainly be used to check which parts of your code are being run. In other words, first make sure your code solves the exercises, and only later run the test script, which is less easy/nice to read.
However, obviously you might get to a situation where your code seems to work, but a test fails. In that case, it is worth looking into the code in report1.py to work out what exactly is going on.
One possibility that might trick some is that if the test compares a value computed by your code, the datatype of that value may be important. For instance, a list is not the same as a NumPy ndarray, and a tuple is different from a list.
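The datatype pitfall can be reproduced with plain unittest. The sketch below uses only the standard library, so it covers the tuple/list case; a NumPy ndarray behaves differently again, since comparing it with a list yields an element-wise array rather than a single True or False. The class name DatatypeDemo is hypothetical.

```python
import unittest


class DatatypeDemo(unittest.TestCase):
    def test_list_equals_list(self):
        self.assertEqual([1, 2, 1], [1, 2, 1])

    def test_tuple_differs_from_list(self):
        # Same elements, different container type: Python treats them as unequal.
        self.assertNotEqual((1, 2, 1), [1, 2, 1])

    def test_convert_before_returning(self):
        # Converting to the expected type makes the comparison succeed.
        self.assertEqual(list((1, 2, 1)), [1, 2, 1])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DatatypeDemo)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If a test fails even though the printed values look identical, checking type(...) of the returned value is a quick first step.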
The report1.py class is really confusing. I can see the code it runs on my computer, but not the expected output. Why is it like this?
To make sure the desired output of the tests is always up to date, the tests are computed from a working version of the code and loaded from the disk rather than being hard-coded.
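The general idea of computing expected results from a working solution and loading them from disk can be sketched as follows. This is not Unitgrade's actual cache format; the functions reference_solution, student_solution, buggy_solution, save_expected and check, and the file name expected.pkl, are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def reference_solution(values):
    """Stand-in for the instructor's working code."""
    return sorted(values)


def student_solution(values):
    """Stand-in for a correct student submission."""
    return sorted(values)


def buggy_solution(values):
    """Stand-in for a submission that forgets to sort."""
    return list(values)


def save_expected(path, inputs):
    """Instructor side: compute expected outputs once and store them on disk."""
    expected = {repr(args): reference_solution(args) for args in inputs}
    path.write_bytes(pickle.dumps(expected))


def check(path, func, inputs):
    """Student side: load stored outputs and compare them with func's results."""
    expected = pickle.loads(path.read_bytes())
    return {repr(args): func(args) == expected[repr(args)] for args in inputs}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "expected.pkl"
    inputs = [[3, 1, 2], [0.5, 0.1]]
    save_expected(cache, inputs)
    outcome = check(cache, student_solution, inputs)
    buggy_outcome = check(cache, buggy_solution, inputs)

print(outcome)
print(buggy_outcome)
```

Storing the results this way means the expected values never drift out of date with the reference code, at the cost of not being readable in the test file itself.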
How do I see the output of my programs in the tests? Or the intended output?
There are a number of console options available to help you figure out what your program should output and what it currently outputs. They can be found using:
python report1.py --help
Note that these are disabled for the report1_grade.py script to avoid confusion. It is not recommended that you use the grade script to debug your code.
Since I cannot read the .token file, can I trust it contains the same number of points internally as the file name indicates?
Yes.
I managed to reverse engineer the report1_grade.py/*.token files in about 30 minutes. If the safety measures are so easily broken, how do you ensure people do not cheat?
That the script report1_grade.py is difficult to read is not the principal safety measure. Instead, it ensures there is no accidental tampering. If you muck around with these files and upload the result, we will very likely know you edited them.
I have private data on my computer. Will this be read or uploaded?
No. The code will look for and include your solutions in the .token-file, but it will not read/look at other directories on your computer. As long as you keep your private files out of the directory that contains your homework, you have nothing to worry about.
Does this code install any spyware/etc.? Does it communicate with a website/online service?
Unitgrade makes no changes outside the courseware directory, and it does not do anything tricky. It reads/runs code and produces the .token file. The development version of unitgrade has an experimental feature to look at a GitHub page and check that your version of the tests is up-to-date, but this is currently not enabled, and all it would do is warn you about a potential problem with an outdated test.
I still have concerns about running code on my computer I cannot easily read
Please contact me and we can discuss your specific concerns.
@online{unitgrade,
title={Unitgrade (0.1.29.0): \texttt{pip install unitgrade}},
url={https://lab.compute.dtu.dk/tuhe/unitgrade},
urldate = {2022-09-16},
month={9},
publisher={Technical University of Denmark (DTU)},
author={Tue Herlau},
year={2022},
}