Commit a61f6155 authored 2 years ago by Dave Pigott
Add deployment and LAVA team roadmap
parent e7d16380
1 merge request: !58 Add deployment and LAVA team roadmap
Pipeline #55193 passed 2 years ago (Stage: test)
Showing 2 changed files with 203 additions and 0 deletions
content/lab-deployment-plan.md +70 -0
content/roadmap.md +133 -0
content/lab-deployment-plan.md 0 → 100644 (+70 −0)
---
title: Collabora Lava Lab device deployment plan
---
## In stock, ready for deployment
### Ampere emag servers
* Quantity: 4
* Ready to go in next batch
* 1 can go in straight away; it just needs some testing with DHCP to confirm that the correct EFI GRUB is sent to it
* 3 others need some firmware flashing as well.
* [T34634](https://phabricator.collabora.com/T34634)
## In stock, awaiting dependencies
### Chromebook Tomato (cherry) Acer
* Quantity: 12
* Ready to be deployed when new dispatchers are set up
* [T36522](https://phabricator.collabora.com/T36522)
### Renegade elite (rk3399)
* Quantity: 5
* Just needs a tweak to the docs to point to the right firmware, so that any boards we add or re-flash match the way the current ones are set up
* [T38110](https://phabricator.collabora.com/T38110)
### Rock 5B
* Quantity: NA
* A couple could go in, but they are of a different spec. Lower priority
### Ampere Mt jade
* Quantity: 1(?)
* Awaiting confirmation we can use it in the Lab – unit we have was pre-release
## With engineer for integration
### Chromebook Berknip (zork) HP
* Quantity: 6
* One on staging so Laura can work on depthcharge
* [T40291](https://phabricator.collabora.com/T40291)
### Chromebook Dewatt (guybrush) Acer
* Quantity: 12
* In bring up with Laura/Lucas
* [T39039](https://phabricator.collabora.com/T39039)
### Apertis potential new renesas
* Quantity: 5
* Apertis working on roadmap for lab deployment
### Chromebook Kaisa (puff)
* Quantity: 12
* On its way to Laura for her to work on depthcharge
* [T40243](https://phabricator.collabora.com/T40243)
### Chromebook Volmar (brya) Acer
* Quantity: 12
* On its way to Laura for her to work on depthcharge
* [T40244](https://phabricator.collabora.com/T40244)
## Awaiting stock
### Chromebook arcada
* Quantity: NA
* Not with us yet. Nick working on customs invoice with Google
### Chromebook Volteer
* Quantity: 5 or 10
* Not with us yet. Nick working on customs invoice with Google
* Mesa would like some more of these before the split, so we are in communication with Google to source more.
* [T39591](https://phabricator.collabora.com/T39591)
## Potential but unknown
### TI AM62xx ?? 5 – 10
* Quantity: 5 - 10
content/roadmap.md 0 → 100644 (+133 −0)
---
title: LAVA team roadmap
---
## LAVA Development
### Internals (T31327)
#### Review internal/external LAVA server-worker API
* Find differences between internal/external flow
* Verify if it can be unified
* Reduce LAVA code base by reusing common components
#### Improve job logs
* Lower occurrences of "Listened connection for namespace '%s' for up to %ds" message (T37051)
* Consider `\r` as a valid line end marker when monitoring the DUT's console (T37054)
  - Issue reported upstream: https://git.lavasoftware.org/lava/lava/-/issues/561
* Allow keeping escape control characters (T37055)
  - Both items above resolved by: https://gitlab.collabora.com/lava/lava/-/merge_requests/120 (to be upstreamed)
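
To illustrate the `\r` item, here is a minimal sketch of splitting console output on `\r\n`, `\n`, or a bare `\r`. It is not the code from the merge request above: the helper name `split_console_lines` is hypothetical, and it assumes the console output arrives as text chunks.

```python
import re

# \r\n is listed first so that it counts as one line end, not two.
_LINE_END = re.compile(r"\r\n|\n|\r")


def split_console_lines(buffer):
    """Split a chunk of console text into (complete_lines, remainder).

    Hypothetical helper for illustration only.
    """
    held = ""
    if buffer.endswith("\r"):
        # A trailing \r may be the first half of a \r\n pair that has not
        # fully arrived yet, so hold it back until more data is read.
        buffer, held = buffer[:-1], "\r"
    parts = _LINE_END.split(buffer)
    # Everything before the final terminator is a complete line; the last
    # element is the unterminated tail (possibly empty).
    return parts[:-1], parts[-1] + held
```

The caller would prepend the returned remainder to the next chunk read from the console. One consequence of holding back a trailing `\r` is that a line ending in a bare `\r` is only emitted once more data arrives.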
#### Traffic reduction (T32184)
* Main goal is to provide new, more efficient ways of handling logs
* Keep in mind to document any dropped solution proposal
* Start by mimicking Open Build Service log handling
#### Benchmarks (T32182)
* Ping upstream for review, update demo-related branches across all relevant repositories (less than half day)
* Add benchmarks for frequently used API endpoints (less than quarter day)
* Enable benchmarking pipeline at least in the internal GitLab (less than half day)
* Extend benchmarking scenarios (for generated database and tests)
* Review bottlenecks found by benchmarks (preferably with solution proposals)
* Submit a blog post with rationale and implementation details
### Option for disabling viewinggroups
* [LAVA MR 1942](https://git.lavasoftware.org/lava/lava/-/merge_requests/1942)
* Awaiting approval or decline by LAVA Team
### Revise stats collection in the database
* Review index usage and look for little used ones – drop them from Django or from Postgres
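
One possible starting point for the index review is PostgreSQL's `pg_stat_user_indexes` statistics view, which records how often each index has been scanned. The sketch below is an assumption about how such a review could be scripted: the `rarely_used` helper and the `max_scans` threshold are hypothetical, and the rows are expected as dictionaries keyed by column name.

```python
# Lists every user index with its scan count, least used first.
INDEX_USAGE_SQL = """
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC
"""


def rarely_used(rows, max_scans=10):
    """Return names of indexes scanned at most max_scans times.

    rows: iterable of dicts produced by running INDEX_USAGE_SQL.
    max_scans: hypothetical threshold; tune it against real traffic.
    """
    return [row["indexrelname"] for row in rows if row["idx_scan"] <= max_scans]
```

The counters only reflect activity since statistics were last reset, so a low count is a prompt for investigation rather than proof that an index is safe to drop.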
### Postgres Vacuum
* Periodic stall check: Kubernetes provides support for long-running crontabs
### DB Use cases
* Which package should it be put in? lava-dev, lava-debug? (latter does not exist yet)
### Job output compression
* Currently timing out; do a binary chop on the compression period
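
A binary chop here could look like the sketch below, under one reading of the item: search for the largest compression period that still finishes before the timeout. The function name and the `completes_in_time` callback are hypothetical, and the search assumes that if a given period completes in time, every shorter period does too.

```python
def largest_passing_period(low, high, completes_in_time):
    """Return the largest period in [low, high] that completes in time.

    completes_in_time: hypothetical callback taking a period and returning
    True if compressing that period finishes before the timeout.
    Returns None if no period in the range passes.
    """
    best = None
    while low <= high:
        mid = (low + high) // 2
        if completes_in_time(mid):
            best = mid       # mid works; try a longer period
            low = mid + 1
        else:
            high = mid - 1   # mid timed out; try a shorter period
    return best
```

Each probe costs one compression attempt, so a range of 365 candidate periods needs at most nine attempts.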
## LAVA CI
* Results comparison using internal pytest-benchmark mechanism
## Security
### Codebase review
* Run as GitLab runners?
#### Automated scanning:
* [Verifying Django generated HTML](https://github.com/peterbe/django-html-validator)
* [Finding security flaws in python](https://pypi.org/project/bandit/)
* [Being fixed by LAVA team](https://git.lavasoftware.org/lava/lava/-/issues/584)
* [Python code quality checker](https://github.com/PyCQA/pyflakes)
## System administration
### Resource issues
* What if someone is unavailable - how do we mitigate - create a plan
### Alerting for predictable defects
* If support services are unavailable or are about to become unavailable, alert and remedy.
### Storing and extracting metadata: Loki, Prometheus/Victoria/Mimir
* Kubernetes only stores 10MB of data, so with large logs we lose data. Develop a mitigation strategy.
* Sometimes Loki loses connection after upgrade. Investigate underlying causes
### Postgres optimization
* Use Unix sockets instead of TCP, outline comparison
* Find out what performance benefit, if any, would result
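
The comparison could be outlined with a small timing harness like the sketch below. With libpq-based clients, a `host` value that is an absolute path selects a Unix-domain socket in that directory. The socket directory, database name, and `median_seconds` helper are assumptions for illustration, not values taken from the current deployment.

```python
import statistics
import time

# Assumed values: Debian's default socket directory and a placeholder
# database name. Adjust both to match the real deployment.
UNIX_DSN = "host=/var/run/postgresql dbname=lavaserver"
TCP_DSN = "host=127.0.0.1 port=5432 dbname=lavaserver"


def median_seconds(run_query, repeats=100):
    """Median wall-clock time of calling run_query() `repeats` times."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

In practice `run_query` would execute the same representative query on a connection opened with each DSN, and the two medians would be compared. Unix sockets only apply when client and server share a host or a mounted socket directory.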
### Dispatcher version synchronisation
* Plan a move to lavapeur and automated upgrades
### Device controllers
#### Fleet management
#### Conserver, PDU control etc, etc.
* Analyse actual reasons for issue occurrences
#### Align deployment
* Docker image alignment with upstream
#### Consider Prometheus alternatives
* Investigate and produce a plan if suitable alternative found
## Monitoring
### Revisit db index usage
* [How often is it updated?](https://monitoring.core.collabora.dev/d/IDWko4VVk/postgresql-stats)
* Replace ratio value with cache misses
* Add Grafana alerts for potential defects
## LAVA Lab device integration and deployment
* See deployment roadmap in GitLab
## Operator's perspective
### Hardware management
#### Configuration and fleet management (controller boards)
* Unify configuration management to use Ansible, e.g. for device configuration changes rollout (T21468)
* Move DUT controlling utilities (pdudaemon, conserver, etc.) from dispatcher to external [Target Managers](https://elinux.org/Test_Glossary)
#### Operator's routines
* Provide a list of _known failures_ (e.g. pending external support) to prevent ignoring new alerts
* Add a _"blame hardware"_ CronJob for issues resolved by reseating connections
### Administration and integration
#### Monitoring cloud-friendliness (T32181)
* Check which tracing solution (Sentry, Jaeger, etc.) fits best with current setup
* Provide minimal working setup for initial testing and change verification
* Add tracing service to the deployment
#### Investigate available storage solutions
* Take into account other products than Kubernetes volumes
* Compare benefit-to-cost ratios
* Keep in mind storage size reduction efforts (outdated jobs, job artifacts removal)
#### Component upgrades: Synchronize dispatcher version with server
* Determine how the dispatcher version is exposed and when upgrade should be enforced (half day)
* Verify if upstream approach with host daemon can be reused or improved (half day - a day)
* Verify dispatcher upgrade mechanism with Kubernetes-based server (half-day)
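
As a rough sketch of when an upgrade might be enforced, the check below compares only the release part of two version strings. It assumes versions start with a `YEAR.MONTH` prefix (for example `2023.10`) and may carry a development suffix. That format, the function names, and the policy of ignoring the suffix are all assumptions to be confirmed in the first step above.

```python
import re

# Assumed format: "YYYY.MM" optionally followed by a development suffix.
_RELEASE = re.compile(r"^(\d{4})\.(\d{1,2})")


def release_of(version):
    """Return (year, month) parsed from the start of a version string."""
    match = _RELEASE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def dispatcher_needs_upgrade(server_version, dispatcher_version):
    """True if the dispatcher's release is older than the server's."""
    return release_of(dispatcher_version) < release_of(server_version)
```

Comparing tuples rather than raw strings avoids ordering mistakes such as `2023.9` sorting after `2023.10`.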
#### Component upgrades: Extend component version management
* Set up mirror repository with a CI job triggered by a new tag (less than half day)
* Rebase staging branch on the new release assuming no merge conflicts - to be reviewed manually (half day)
* Determine which components might need version pinning/manual upgrades (if any)
#### Batch processing
* Parse job output from [lava-gitlab-runner](https://gitlab.collabora.com/lava/lava-gitlab-runner)