Introduction
The school gets a nightly dump of enrolment information about all students currently associated with our school or msor (either enrolled in one of our courses or declared as one of our majors). This is processed into a tab separated file that enables easy unix-based analysis of the data. Each file contains the full enrolment history of each student, and the school keeps a database of past files (each day for the past week, last friday of the month for every month).
Each line of the file contains the details of the enrolment of a student in a course, along with additional details about the student (eg, degrees, majors, gender, international/domestic). The first field of the line is the student ID, the fields are tab separated, and the file is sorted by student ID.
There are some standard tools for querying this data. There are some additional tools that pondy has created that do some additional useful queries.
To be able to use these tools:
- type
need admin-tools
at your shell to get access to the standard tools
- add
~pondy/bin/admin-tools
to your path get pondy's extensions.
The basic query tool is
students
-
students
will list all the fields of the file, along with their index (needed for some queries)
(note, this is pondy's version of the standard enrolments
command. It is roughly the same as the enrolments -p
, but does some additional cleaning up, (for example, adding a comma to A, B, and C, in the grades field so that grades sort properly with unix sort
).
-
students
query-terms
- lists all the entries in data file that match the query terms.
- no arguments will list the fields in the data file
- a query term can either be a pattern to match in the whole line, or it can be preceded by a field number. eg
-
students comp102 2:2014 3:1 "\sbehons\s"
will list all the course enrolment lines for students taking comp102 in 2014 in trimester 1 and who are enrolled for a BE.
-
students-single
query-terms
lists all the entries for each student that has any course enrolment matching all the query terms. (ie, the complete history for each such student)
-
students-multi
query-terms
Lists the complete history for each student who has course enrolments matching each of the query terms.
All three queries take flag arguments to specify a past year (-y YYYY) or a specific date (-d YYYY-MM-DD)
You can pipe the results of these query (or anything else) through the following:
-
project
- field-numbers
selects just the specified fields in the order specified
-
select
- query-terms
performs the same query as the student
command above, but on its input rather than the original data file
. (students
uses select
). [use selectdb
if you are on a bash shell]
-
joinup
-
given several lines with the same value in their first field, puts all the remaining fields of all the lines onto a single line: eg
a 1
a 2
b 3
b 5
b 6
turns into
a 1 2
b 3 5 6
-
unjoinup
-
does the reverse of joinup
-
students-tt
query-terms
lists the courses for the current year of students matching the query terms, and then lists the timetable of all the courses
-
students-count
query-terms
enrolment summary of courses matching the query terms
-
paf
query-terms
gives a summary of the points and courses for the enrolments matching the query terms. Mosty useful only when the query terms are selecting all the courses of a single student (eg their ID number)
Two other useful databases:
-
rooms
query-terms
list the entries in the rooms file that match the terms
-
tt
query-terms
lists all the entries in the current year's timetable that match the terms