Dear all,
We have run into an issue: our job always exits partway through a long-running test case (around 20 minutes) that produces no output. The LAVA server reports an InfrastructureError and prints the following:
Connection closed by foreign host.
Marking unfinished test run as failed
definition: lava
result: fail
case: 0_apache-servers1
uuid: 597_1.4.2.4.1
duration: 603.53
lava_test_shell connection dropped.
end: 3.1 lava-test-shell (duration 00:10:05) [ns_s1]
namespace: ns_s1
extra: ...
definition: lava
level: 3.1
result: fail
case: lava-test-shell
duration: 604.55
lava-test-retry failed: 1 of 1 attempts. 'lava_test_shell connection dropped.'
lava_test_shell connection dropped.
We reproduced the problem with a very simple Python script:
#!/usr/bin/env python3
import time
print('Hello,world!')
time.sleep(1200)
print("Hello,Lava!")
We can see the 'Hello,world!' string in the output, but no further output from this program appears in the web UI.
We do not know what is wrong, so we are writing to ask for your help.
Sincerely,
Chuan Su
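One possible workaround, assuming the connection is dropped because the test stays silent for a long time (a guess based on the log above, not a confirmed diagnosis), is to print a periodic heartbeat line while the long task runs. The sketch below is hypothetical: the `run_with_heartbeat` helper, its message and the interval are illustrative only and are not part of LAVA.

```python
import threading
import time


def run_with_heartbeat(task, interval=60.0, message="still running..."):
    """Run task() while a background thread prints a heartbeat line.

    Hypothetical helper, not part of LAVA: it only keeps the console
    from staying silent while a long test is executing.
    """
    done = threading.Event()

    def beat():
        # Event.wait() returns False on timeout and True once `done` is set.
        while not done.wait(interval):
            print(message, flush=True)

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    try:
        return task()
    finally:
        # Stop the heartbeat thread whether the task succeeded or raised.
        done.set()
        thread.join()
```

In the script above, this would mean replacing `time.sleep(1200)` with `run_with_heartbeat(lambda: time.sleep(1200))`.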
Hi everyone,
Is it possible to handle git authentication in a test job?
I need LAVA to clone a repository that cannot be made public,
and the clone fails at the authentication step.
Is it possible to specify a password or a token?
Best regards,
Axel
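One common approach with HTTPS remotes, independent of LAVA, is to embed an access token in the clone URL and read that token from the environment rather than hard-coding it. The sketch below is only an illustration: the `GIT_TOKEN` variable name, the `oauth2` user name and any URL passed in are assumptions, the accepted user name depends on the Git hosting service, and how a secret reaches the test environment is left open.

```python
import os
from urllib.parse import quote, urlsplit, urlunsplit


def authenticated_url(url, token, user="oauth2"):
    """Return url with user:token credentials inserted before the host."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("only https:// URLs are handled in this sketch")
    # Percent-encode the credentials so special characters stay valid.
    netloc = f"{quote(user, safe='')}:{quote(token, safe='')}@{parts.hostname}"
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def clone_command(url, env=os.environ):
    """Build a git clone command using the (assumed) GIT_TOKEN variable."""
    token = env.get("GIT_TOKEN")
    if not token:
        raise RuntimeError("GIT_TOKEN is not set")
    return ["git", "clone", authenticated_url(url, token)]
```

A URL built this way may end up in job logs, so a token with a narrow, read-only scope would be the safer choice.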
Dear all,
We found that when LAVA executes a script that outputs a very long string (more than 30000 bytes) on a single line (with only one line break), the LAVA web UI always hangs and no further log output appears. The devices under test (DUTs) remain powered on until the LAVA job timeout triggers. However, after checking the whole log file we found that the test cases following the hanging one were still executed (new files were generated).
So the problem is that when LAVA encounters such cases, the web UI hangs and the DUTs may not be powered off once all the cases have completed.
Best wishes,
Chuan Su
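Until the cause is found, a possible workaround on the test side (an assumption, not a confirmed fix) is to break very long lines into shorter ones before they reach the console. The sketch below uses a hypothetical `MAX_LINE` limit of 1000 characters, chosen arbitrarily to stay well below the roughly 30000 bytes reported above.

```python
import sys

MAX_LINE = 1000  # arbitrary illustrative limit, not a documented LAVA value


def split_long_line(line, limit=MAX_LINE):
    """Split one line into chunks of at most `limit` characters."""
    line = line.rstrip("\n")
    if not line:
        return [""]
    return [line[i:i + limit] for i in range(0, len(line), limit)]


def emit(stream_in=sys.stdin, stream_out=sys.stdout, limit=MAX_LINE):
    """Copy stream_in to stream_out, re-wrapping any over-long lines."""
    for line in stream_in:
        for chunk in split_long_line(line, limit):
            stream_out.write(chunk + "\n")
    stream_out.flush()
```

Saved as a small filter script that calls `emit()`, this could sit between a noisy test command and the console through a shell pipe.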
On Tue, 11 Dec 2018 at 11:30, Neil Williams <neil.williams at linaro.org> wrote:
> On Tue, 11 Dec 2018 at 11:28, Tim Jaacks <tim.jaacks(a)garz-fricke.com> wrote:
> >
> > Thanks, the CLI operations are very helpful for automating the process.
> > However, the docs say that all devices in "Reserved" state have to
> > have their "current job" cleared. I can use "lava-server manage devices details"
> > to check whether this field is actually set. There is no command to
> > modify it, though. Seems like using the Python API is the only way to
> > go here, right? The same applies to setting "Running" jobs to "Cancelled".
>
> https://git.lavasoftware.org/lava/lava/merge_requests/273
>
> This should get into the upcoming 2018.12 release.
Thank you very much for your quick help. The "lava-server manage jobs fail"
command takes care of clearing the "current job" field of the associated
device, do I understand that right?
Mit freundlichen Grüßen / Best regards
Tim Jaacks
DEVELOPMENT ENGINEER
Garz & Fricke GmbH
Tempowerkring 2
21079 Hamburg
Direct: +49 40 791 899 - 55
Fax: +49 40 791899 - 39
tim.jaacks(a)garz-fricke.com
www.garz-fricke.com
WE MAKE IT YOURS!
Registered office: D-21079 Hamburg
Commercial register: Amtsgericht Hamburg, HRB 60514
Managing directors: Matthias Fricke, Manfred Garz, Marc-Michael Braun
Hi folks,
We at Fairphone have developed a variant of the Tradefed-runner in LAVA
test-definitions that is meant to run complete Tradefed test suites on
multiple devices by making use of the shards feature in Tradefed. The
runner is currently in a “staging” state. We would still like to share now what
we are using and developing, to see whether more people are
interested in it. Feedback on the general approach taken would also be
much appreciated.
On the higher level, our setup works as follows:
• Use MultiNode to allocate multiple devices for one test submission.
• One “master” runs the Tradefed shell, similar to the
existing runner.
• The master connects to the workers’ DUTs via adb TCP/IP. These
DUTs are transparently available to Tradefed in the same way as
USB-attached devices.
• Workers ensure that their respective DUTs remain accessible to
the master, especially in case of WLAN disconnects, reboots, crashes, etc.
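As a rough illustration of the master side of this setup, the sketch below connects to a list of worker DUTs with `adb connect`. It is not taken from the Fairphone runner: the host names passed in, the default port 5555 and the check on the command output are assumptions made for the example.

```python
import subprocess

ADB_TCP_PORT = 5555  # assumed default port for adb over TCP/IP


def adb_connect_command(host, port=ADB_TCP_PORT):
    """Build the adb command line for one DUT."""
    return ["adb", "connect", f"{host}:{port}"]


def connect_all(hosts, run=subprocess.run):
    """Try to connect to every DUT and report success per host.

    Success is judged from the text adb prints, which is an assumption
    about its output format rather than a documented interface.
    """
    results = {}
    for host in hosts:
        completed = run(adb_connect_command(host), capture_output=True, text=True)
        results[host] = completed.returncode == 0 and "connected to" in completed.stdout
    return results
```

A worker could call something like `connect_all` again whenever a DUT drops off the network, which is one way to picture the "remain accessible" duty described above.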
Major features of our runner:
• Support for Android CTS, GTS and STS.
• Test run split into “shards” in Tradefed to run tests in parallel
on multiple devices. This allows for a major speedup when running large
test suites.
• Tradefed retry: Rerun test suites until the failure count stabilizes.
• No adb root required.
• Based on the original Tradefed runner, having at least parts of
the common code moved to python libraries.
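The retry feature listed above can be pictured as a loop that reruns the suite until the failure count stops decreasing. This is only a sketch of the idea, not the runner's actual code: `run_suite` is a hypothetical callable returning the current number of failures, and treating "no improvement over the previous run" as "stable" is an assumption.

```python
def retry_until_stable(run_suite, max_runs=5):
    """Call run_suite() until the failure count stops improving.

    Returns the list of failure counts observed, one per run.
    """
    history = []
    for _ in range(max_runs):
        failures = run_suite()
        history.append(failures)
        if failures == 0:
            break  # nothing left to retry
        if len(history) > 1 and failures >= history[-2]:
            break  # no improvement over the previous run
    return history
```

The `max_runs` cap keeps a flaky suite from being rerun indefinitely.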
Current limitations:
• Test executions are not always stable. This needs further
investigation.
• Test executions produce more false positives than local test
runs. This needs further investigation but is at least partially due to
using adb TCP/IP instead of a local USB connection.
• Android VTS is not implemented (this would require only minor changes).
Our current changes have been pushed to the tradefed_shards_with_retry
topic on Gerrit[1]. Besides the two major changes to add MultiNode adb
support and then Tradefed support on top of that, a couple of smaller
changes that could be useful on their own have also been pushed.
We are looking forward to your feedback and to joint efforts in
automating and speeding up Tradefed test executions!
Best regards,
Karsten for the Fairphone Software Team
[1] https://review.linaro.org/q/topic:%22tradefed_shards_with_retry%22+(status:…
On Mon, 10 Dec 2018 at 20:16, Neil Williams <neil.williams at linaro.org> wrote:
> Yes, there is a problem there - thanks for catching it. I think the
> bulk of the page dates from the last stages of the migration when V1
> data was still around. I'll look at an update of the page tomorrow.
> Step 7 is a sanity check that the install of the empty instance has
> gone well, Step 9 is to ensure that the newly restored database is put
> into maintenance as soon as possible to prevent any queued test jobs
> from attempting to start. The critical element of Step 9 is to ensure
> that the lava-master service is stopped.
>
> The emphasis of the section is on ensuring that the instance only
> serves a "Maintenance" page, e.g. the default Debian "It works!"
> apache page, to prevent access to the instance during the restore.
Thanks for pointing that out, Neil. I understand now that the Apache
server has to serve a static page during the restore process.
> Accessing the UI would involve having an alternative way to serve the
> pages. If that can be arranged, just for admins, (e.g. by changing the
> external routing to the box or redirecting DNS temporarily) then the
> UI on the instance can be used with the change that the
> lava-server-gunicorn service does not need to be stopped (because
> access has been redirected). Other services would be stopped. However,
> this would involve a fair number of apache config changes, so is best
> left to those admins who have such config already on hand.
>
> The operations can be done from the command line and that's probably
> best for these docs.
>
> Step 7 can be replaced by:
>
> lava-server manage check --deploy
>
> Step 9 can be replaced by looping over:
>
> lava-server manage devices update --health MAINTENANCE --hostname ${HOSTNAME}
>
> or, if there are a lot of devices:
>
> lava-server manage maintenance --force
>
> (This maintenance helper has been fixed in master - soon to be 2018.12
> - so older versions would use the first command & loop.)
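The loop mentioned for Step 9 could be scripted along these lines. The command line is the one quoted above; the device host names are placeholders supplied by the caller, and the `runner` parameter exists only to make the sketch testable.

```python
import subprocess


def maintenance_command(hostname):
    """Build the CLI call quoted above for one device."""
    return [
        "lava-server", "manage", "devices", "update",
        "--health", "MAINTENANCE",
        "--hostname", hostname,
    ]


def set_all_to_maintenance(hostnames, runner=subprocess.check_call):
    """Run the update command once per device, stopping on the first error."""
    for hostname in hostnames:
        runner(maintenance_command(hostname))
```

With the default `subprocess.check_call`, a failing command raises an exception, so a restore script would stop instead of continuing with devices still active.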
Thanks, the CLI operations are very helpful for automating the process.
However, the docs say that all devices in "Reserved" state have to have
their "current job" cleared. I can use "lava-server manage devices details"
to check whether this field is actually set. There is no command to
modify it, though. Seems like using the Python API is the only way to go
here, right? The same applies to setting "Running" jobs to "Cancelled".
> I'll look at changing the page to use CLI operations for steps 7 and
> 9. Some labs can do the http redirect / routing method but the detail
> of that is probably not in scope for this page in the LAVA docs. I'll
> add a note that admins have that choice but leave it for those admins
> to implement.
Mit freundlichen Grüßen / Best regards
Tim Jaacks
Hello everyone,
I am trying to implement a backup and restore routine for our LAVA server, based on the documentation:
https://validation.linaro.org/static/docs/v2/admin-backups.html#restoring-a…
Creating the backup is straightforward. However, I have problems with the order of the proposed restore steps.
Step 6 is "Stop all LAVA services". However, step 7 then says "Make sure that this instance actually works by browsing a few (empty) instance pages." Surely this should be done before the services are stopped?
The actual problem is that step 9 says "In the Django administration interface, take all devices which are not Retired into Offline". This cannot be just an ordering issue, because the LAVA services must not be available during these modifications. How do I use the Django admin interface while all LAVA services are stopped?
Mit freundlichen Grüßen / Best regards
Tim Jaacks
Dear all,
I have a question about using LAVA.
Background:
1. I have only one hardware device, running Android.
2. I have a device-type jinja2 file that starts with "{% extends 'base-fastboot.jinja2' %}".
Here, I use "adb reboot bootloader" to enter fastboot.
3. I have another device-type jinja2 file that starts with "{% extends 'base-uboot.jinja2' %}".
Here, I use "fastboot 0" in U-Boot to enter fastboot.
We now have a scenario that needs to be tested with both of the above methods, but we have only one device. Is it possible for the user to define a parameter in job.yaml to switch between the two methods on the same device? Any suggestions?
Thanks,
Larry
Hello,
I got the following crash with 2018.11 on Debian stretch:
dpkg -l|grep lava
ii  lava              2018.11-1~bpo9+1  all    Linaro Automated Validation Architecture metapackage
ii  lava-common       2018.11-1~bpo9+1  all    Linaro Automated Validation Architecture common
ii  lava-coordinator  0.1.7-1           all    LAVA Coordinator daemon
ii  lava-dev          2018.11-1~bpo9+1  all    Linaro Automated Validation Architecture developer support
ii  lava-dispatcher   2018.11-1~bpo9+1  amd64  Linaro Automated Validation Architecture dispatcher
ii  lava-server       2018.11-1~bpo9+1  all    Linaro Automated Validation Architecture server
ii  lava-server-doc   2018.11-1~bpo9+1  all    Linaro Automated Validation Architecture documentation
ii  lavacli           0.9.3-1~bpo9+1    all    LAVA XML-RPC command line interface
ii  lavapdu-client    0.0.5-1           all    LAVA PDU client
ii  lavapdu-daemon    0.0.5-1           all    LAVA PDU control daemon
2018-12-04 14:14:40,187 ERROR [EXIT] Unknown exception raised, leaving!
2018-12-04 14:14:40,187 ERROR string index out of range
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/lava_server/management/commands/lava-logs.py", line 193, in handle
    self.main_loop()
  File "/usr/lib/python3/dist-packages/lava_server/management/commands/lava-logs.py", line 253, in main_loop
    while self.wait_for_messages(False):
  File "/usr/lib/python3/dist-packages/lava_server/management/commands/lava-logs.py", line 287, in wait_for_messages
    self.logging_socket()
  File "/usr/lib/python3/dist-packages/lava_server/management/commands/lava-logs.py", line 433, in logging_socket
    job.save()
  File "/usr/lib/python3/dist-packages/django_restricted_resource/models.py", line 71, in save
    return super(RestrictedResource, self).save(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/django/db/models/base.py", line 796, in save
    force_update=force_update, update_fields=update_fields)
  File "/usr/lib/python3/dist-packages/django/db/models/base.py", line 820, in save_base
    update_fields=update_fields)
  File "/usr/lib/python3/dist-packages/django/dispatch/dispatcher.py", line 191, in send
    response = receiver(signal=self, sender=sender, **named)
  File "/usr/lib/python3/dist-packages/lava_scheduler_app/signals.py", line 139, in testjob_notifications
    send_notifications(job)
  File "/usr/lib/python3/dist-packages/lava_scheduler_app/notifications.py", line 305, in send_notifications
    title, body, settings.SERVER_EMAIL, [recipient.email_address]
  File "/usr/lib/python3/dist-packages/django/core/mail/__init__.py", line 62, in send_mail
    return mail.send()
  File "/usr/lib/python3/dist-packages/django/core/mail/message.py", line 342, in send
    return self.get_connection(fail_silently).send_messages([self])
  File "/usr/lib/python3/dist-packages/django/core/mail/backends/smtp.py", line 107, in send_messages
    sent = self._send(message)
  File "/usr/lib/python3/dist-packages/django/core/mail/backends/smtp.py", line 120, in _send
    recipients = [sanitize_address(addr, encoding) for addr in email_message.recipients()]
  File "/usr/lib/python3/dist-packages/django/core/mail/backends/smtp.py", line 120, in <listcomp>
    recipients = [sanitize_address(addr, encoding) for addr in email_message.recipients()]
  File "/usr/lib/python3/dist-packages/django/core/mail/message.py", line 161, in sanitize_address
    address = Address(nm, addr_spec=addr)
  File "/usr/lib/python3.5/email/headerregistry.py", line 42, in __init__
    a_s, rest = parser.get_addr_spec(addr_spec)
  File "/usr/lib/python3.5/email/_header_value_parser.py", line 1988, in get_addr_spec
    token, value = get_local_part(value)
  File "/usr/lib/python3.5/email/_header_value_parser.py", line 1800, in get_local_part
    if value[0] in CFWS_LEADER:
IndexError: string index out of range
2018-12-04 14:14:40,211 INFO [EXIT] Disconnect logging socket and process messages
2018-12-04 14:14:40,211 DEBUG [EXIT] unbinding from 'tcp://0.0.0.0:5555'
2018-12-04 14:14:50,221 INFO [EXIT] Closing the logging socket: the queue is empty
Regards
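The last frames of the traceback suggest that an empty string reached the address parser (`value[0]` fails on an empty `value`), which would be consistent with a notification recipient that has no email address set. That reading is an inference from the traceback, not a confirmed diagnosis. A minimal sketch of the kind of guard that would avoid the exception, using a hypothetical `usable_recipients` helper that is not part of LAVA or Django:

```python
def usable_recipients(addresses):
    """Return only the addresses that are non-empty after stripping whitespace.

    Hypothetical helper: it drops None, empty and whitespace-only entries
    so that they never reach the mail backend.
    """
    return [addr.strip() for addr in addresses if addr and addr.strip()]
```

Checking the notification recipients of the affected job for a user without an email address would be a quick way to test this theory.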