Task list
What needs to be done to keep social.coop running - how often, by whom, etc.
In need of review.
Suggested classifications
These labels are used in the tables below.
Broad skill categories (in general assuming the task is adequately documented):
- No label: does not require any specific skills
- ~1327: requires experience of admin dashboards or GUIs
- ~1331: requires experience of the Unix shell and sys-admin
- ~1328: requires software development experience
- ~1330: requires a broad knowledge of the whole system
- ~1329: requires knowledge of unwritten arbitrary choices made by specific people, AKA "historical reasons"
(Of course we could get much more detailed than this, but we are trying to keep it relatively simple.)
Access categories - distinguishes between tasks which:
- ~1332: require the relevant log-in access, or
- No label: do not require any specific access.
Urgency labels (assigned to issues):
- high: ~1318
- medium: no label
- low: ~1320
Importance labels (assigned to issues):
- high: ~1319
- medium: no label
- low: ~1321
Urgency and Importance are inspired by the Eisenhower method.
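As a rough illustration of how the two label axes might combine, the sketch below maps an issue's labels to an Eisenhower quadrant. The label IDs are the ones listed above; the function names, the quadrant wording, and the decision to treat only "high" as urgent or important (so "medium", i.e. no label, counts as neither) are assumptions made for illustration, not an agreed policy.

```python
# Label IDs as listed above; "medium" is the absence of a label.
URGENCY = {"~1318": 2, "~1320": 0}      # high, low
IMPORTANCE = {"~1319": 2, "~1321": 0}   # high, low
MEDIUM = 1


def level(labels, scale):
    """Return 2 (high), 1 (medium, no label) or 0 (low) for one axis."""
    for label in labels:
        if label in scale:
            return scale[label]
    return MEDIUM


def quadrant(labels):
    """Classify an issue into an Eisenhower quadrant.

    Assumption: only "high" counts as urgent/important.
    """
    urgent = level(labels, URGENCY) == 2
    important = level(labels, IMPORTANCE) == 2
    if urgent and important:
        return "do first"
    if important:
        return "schedule"
    if urgent:
        return "delegate"
    return "do later or drop"


print(quadrant(["~1318", "~1319", "~1331"]))  # do first
print(quadrant(["~1319"]))                     # schedule
print(quadrant(["~1320", "~1321"]))            # do later or drop
```

Skill and access labels (such as ~1331) are simply ignored by this classification, since they sit on separate axes.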
Recurring tasks
Notes:
- These aren't quite issues, but regular sources of issues.
- We could use the GitLab API to schedule the creation of issue tracker tickets for these (see issue #9)
- The "contingencies" column refers to any circumstance or event which determines when the task can or should be done.
- Ideally people have "buddies" on their jobs, rather than being the only assignee.
- The job name should link to a page documenting the job; creating that page is the first assignee's initial task. (Create an issue for it if you need to recruit help.)
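The GitLab API idea in the notes above could look roughly like the following sketch, which builds the request for GitLab's "create issue" REST endpoint (`POST /api/v4/projects/:id/issues`, authenticated with a `PRIVATE-TOKEN` header). The host name, project path, label name and token are placeholders assumed for illustration, and `build_issue_request` is a hypothetical helper; the script would still need to be run on a schedule (for example from cron) to open issues at the right times.

```python
import json
import urllib.parse
import urllib.request

# Assumptions for illustration only: the GitLab host and project path
# below are placeholders, not real values for this project.
GITLAB_URL = "https://gitlab.example.org"
PROJECT = "group/tech"


def build_issue_request(job, token):
    """Build (but do not send) a request that opens an issue for one job."""
    # The project can be addressed by its URL-encoded path.
    project_id = urllib.parse.quote(PROJECT, safe="")
    url = f"{GITLAB_URL}/api/v4/projects/{project_id}/issues"
    payload = {
        "title": job["name"],
        "description": job["description"],
        # The API takes label names as a comma-separated string.
        "labels": ",".join(job.get("labels", [])),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="POST",
    )


job = {
    "name": "Upgrade Mastodon",
    "description": "Upgrade the Mastodon installation",
    "labels": ["sysadmin"],  # hypothetical label name
}
request = build_issue_request(job, token="REPLACE_ME")
print(request.get_method(), request.full_url)

# To actually create the issue, send the request:
# urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the sketch safe to run without credentials and makes it easy to check what would be posted.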
scheduled jobs | description | freq | skill | contingencies | assigned to |
---|---|---|---|---|---|
Upgrade host OS | Full host OS upgrade (e.g. at end of LTS period) | bi-annually | ~1331 ~1332 | scheduled window | |
Renew domain | Renew social.coop's domains when they expire | annually | ~1331 ~1332 | expiry date (~April 2020) | |
Upgrade Mastodon | Upgrade the Mastodon installation | 4-monthly | ~1331 ~1332 | on point releases or demand | |
Upgrade dockerfiles | Rebuild dockerfiles with upgraded packages | 4-monthly | ~1331 ~1332 | when upgrading Mastodon | |
Monitor metrics | Check disk, firewalls, emails, other stats for anomalies | daily? | ~1331 | at convenience | |
Run backups | Run backup scripts, monitor status | N/A | ~1331 ~1332 | now automated | |
Renew SSL certs | Renew Let's Encrypt SSL certs | N/A | ~1331 ~1332 | now automated | |
Upgrade host packages | Upgrade (security) packages on servers | N/A | ~1331 ~1332 | now automated | |
ad-hoc jobs | description | freq | skill | contingencies | assigned to |
---|---|---|---|---|---|
Manage this schedule | Modify as necessary, create tickets when required | monthly? | ~1327 | at convenience | @wu-lee |
Respond to alerts | Handle notifications from automation, escalate as required | daily? | ~1331 | on demand | @wu-lee |
GitLab user support | Administer accounts | fairly rare | ~1327 ~1332 | on demand | @wu-lee |
Respond to tickets | Respond to and delegate internal requests for help | fairly rare | ~1327 ~1332 | on demand | @wu-lee |
Resolve tickets | Perform work to resolve tickets | various | | on demand/at convenience | @wu-lee |
Review merge requests | ... |
Suggested schedule of work:
- upgrade Mastodon/dockerfiles/packages ~3 times a year (Jan/May/Sept?), in pairs, one driving, other documenting
  - include a disaster-recovery drill in this (attempt to restore a backup to a VM)
- renew domain on expiry (currently annually; liaise with finance where appropriate), should only require one person
- on call - monitor metrics regularly (daily/weekly), in pairs, rotate
- meet monthly for quick discussion
Roles
admin (3+ people)
- perform two upgrade/maintenance drills a year (one steering, one documenting)
- be generally contactable (on Matrix) to assist with technical problems
- attend regular meetings
Aim for each upgrade/maintenance session to take a few hours on an agreed date, coordinating via Matrix or similar.
The documenter should keep a record of what happened, and prepare a report for the next meeting.
Meetings should aim to be about an hour, although this can be flexible.
coordinator (1 person, perhaps rotating through admins)
This role rotates through admins
- compile and distribute meeting agendas
- facilitate meetings:
  - introductions/apologies,
  - confirmation of meeting length, next scheduled meeting
  - advance through agenda points, managing duration
  - facilitate decision "rounds" and resolutions? (Sociocratic influences)
- liaise with other working groups (speaker)
secretary (1 person, perhaps rotating through admins)
This role rotates through admins, with the secretary ideally chairing the next meeting
- take minutes in meetings, summarising decisions and actions.
- liaise with other working groups (listener, documenter)
at large (any number)
- attend regular meetings where possible
- contribute to tech group discussions
- assist with technical/operational/governance problems where possible
- become familiar with the operations of the tech group, to aid redundancy and trust
maintainers (as required, 1+ per project)
- maintain a project, such as the wiki server
- manage pull requests, builds/testing and tickets on the project
- optionally attend meetings
developers (as required)
- contribute to a project such as the wiki server
- optionally attend meetings
Aim for 3 people for upgrades, doing two spots each annually
Aim for 2-3 people on call at a time, in different zones, for a week at a time (~8 people?)
Perhaps have upgraders also on call?
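To get a feel for the on-call numbers above, here is a minimal rota sketch that cycles through a list of volunteers, taking a fixed number per week. The names are hypothetical, and the sketch does not model time zones; one crude workaround, assumed here rather than agreed anywhere, would be to order the list so that neighbours are in different zones.

```python
from itertools import cycle, islice


def rota(people, per_week, weeks):
    """Yield one on-call team per week, cycling through people in order."""
    pool = cycle(people)
    for _ in range(weeks):
        yield list(islice(pool, per_week))


people = [f"admin{i}" for i in range(1, 9)]  # 8 hypothetical volunteers
for week, team in enumerate(rota(people, per_week=2, weeks=4), start=1):
    print(week, team)
# 1 ['admin1', 'admin2']
# 2 ['admin3', 'admin4']
# 3 ['admin5', 'admin6']
# 4 ['admin7', 'admin8']
```

With 8 people in pairs, each person is on call one week in four; with teams of three, the cycle wraps around so the teams change from one pass to the next.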
Distinguish between operations (mostly scheduled) and development (mostly ad hoc)
Dev work as required - this may not be trivial, but needs to be considered on a case-by-case basis.
Some projects need management by sub-teams or individuals.