Closed Bug 1064010 Opened 10 years ago Closed 6 years ago

Add "Android instrumentation" tests runs to buildbot config

Categories

(Release Engineering :: General, defect)

All
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: nalexander, Assigned: nalexander)

References

Details

Attachments

(4 files, 2 obsolete files)

In Bug 1064004, I'm trying to get some additional Android instrumentation test suites running in automation.  I had a long discussion with jlund, and he suggested the first step was to get a new test job running on cedar.

These instrumentation test suites are very similar to Robocop.  I intend for these suites to be "pure mozharness", to run only on modern hardware (Pandas, or possibly even emulators), and in general to take whatever simplifying steps are necessary to get them running *somewhere*.  I imagine that each named suite will be a buildbot job (like R for Robocop).
This ticket is intended to be the analog of Bug 959709, which turned on web-platform-tests on cedar.  jlund, if I prepare a patch, are you the right person to review?  If not, can you suggest a reviewer?
Flags: needinfo?(jlund)
Blocks: 1064012
Attached patch 10958.diff (obsolete) — Splinter Review
Maybe this is easier that I expected.  I looked at mozharness/scripts/android_panda.py, and it looks like I can work within that framework to do everything I want to do.
Assignee: nobody → nalexander
Status: NEW → ASSIGNED
Attachment #8485476 - Flags: review?(jlund)
Actually, now that I think of it, perhaps I don't want this on try right away.  Just cedar?
Attached patch 1064010.patch (obsolete) — Splinter Review
Here's a better version of this.  The old one forgot to add the test to all opt unit tests, so it would have (benignly) done nothing.

I've factored out a helper method to clarify.

This version also prepares for the two test suites we'll want to run, browser and background.  We'll probably want more over time, so we might as well avoid 1 vs. N errors up front.
Attachment #8485476 - Attachment is obsolete: true
Attachment #8485476 - Flags: review?(jlund)
Attachment #8485864 - Flags: review?(jlund)
gbrown, this is relevant to your interests.
(In reply to Nick Alexander :nalexander from comment #3)
> Actually, now that I think of it, perhaps I don't want this on try right
> away.  Just cedar?

Yes, you should start with just cedar. If your test is not working yet and it is enabled on try, anyone running 'all' tests might be alarmed at the failure.
Do you want to run against Android 4.0 Opt only (not Android 4.0 Debug, not Android 2.3 Opt)?
(In reply to Geoff Brown [:gbrown] from comment #7)
> Do you want to run against Android 4.0 Opt only (not Android 4.0 Debug, not
> Android 2.3 Opt)?

Yeah, that would be fine for now.  While greening these up, I see no reason to run against 4.0 Debug; and I'm not sure I want to take complexity for 2.3 at all.
(In reply to Nick Alexander :nalexander from comment #1)
> This ticket is intended to be the analog of Bug 959709, which turned on
> web-platform-tests on cedar.  jlund, if I prepare a patch, are you the right
> person to review?  If not, can you suggest a reviewer?

I touched base with nick over IRC. I will review the patch but I ran out of time today. I will look at this tomorrow.
Flags: needinfo?(jlund)
Comment on attachment 8485864 [details] [diff] [review]
1064010.patch

Review of attachment 8485864 [details] [diff] [review]:
-----------------------------------------------------------------

thanks for taking the initiative on this and attempting it yourself. I think we pretty much have it but, and I may be wrong, I think you exposed a bug in a jittest loop (see in-line)

::: mozilla-tests/mobile_config.py
@@ +1776,5 @@
>                  for suite in branch['platforms'][platform][slave_plat][type_][:]:
>                      if "cppunit" in suite[0]:
>                          branch['platforms'][platform][slave_plat][type_].remove(suite)
>  
> +def remove_suite_from_slave_platform(suite_to_remove, slave_platform, branches_to_keep=[]):

awesome

@@ +1789,5 @@
> +            continue
> +        for platform in BRANCHES[branch]['platforms']:
> +            if not platform in PLATFORMS:
> +                continue
> +            if not platform.startswith('android'):

it be nice to make this work outside mobile_config script this and move this method out later

@@ +1801,5 @@
> +                    continue
> +                for type in BRANCHES[branch]['platforms'][platform][slave_plat]:
> +                    for suite in BRANCHES[branch]['platforms'][platform][slave_plat][type]:
> +                        if suite_to_remove in suite[0]:
> +                            BRANCHES[branch]['platforms'][platform][slave_plat][type].remove(suite)

If I understand correctly, we are mutating a list while we iterate over it.

I suppose this worked before with jittest because we only had one suite we targeted with this loop ('jittest') in BRANCHES[branch]['platforms'][platform][slave_plat][type]. But I think we will lose the second 'instrumentation-*' suite as their nodes are side-by-side
Attachment #8485864 - Flags: review?(jlund) → review-
> @@ +1789,5 @@
> > +            continue
> > +        for platform in BRANCHES[branch]['platforms']:
> > +            if not platform in PLATFORMS:
> > +                continue
> > +            if not platform.startswith('android'):
> 
> it be nice to make this work outside mobile_config script this and move this
> method out later

*it be nice to make this work outside mobile_config.py but I can move this
method out later. thanks for lessoning the duplication
Attached file test.py
Trivial testing file.  Test data to follow.
Attachment #8486854 - Flags: review?(jlund)
This data is a significantly reduced subset of some data provided privately by kmoir.
This data is a significantly reduced subset of some data provided privately by kmoir.
(In reply to Nick Alexander :nalexander from comment #12)
> Created attachment 8486854 [details]
> test.py
> 
> Trivial testing file.  Test data to follow.

To test, run |python test.py| or |python test.py -v| to be verbose.
Attachment #8486854 - Attachment is patch: true
Attachment #8486854 - Attachment mime type: application/binary → text/plain
Attachment #8486854 - Attachment is patch: false
Attached patch 1064010.patchSplinter Review
OK, let's try this again.  This iterates on the previous patch by iterating the dictionary items and updating the value-list in place.  There should be no concurrent iteration-and-deletion with this scheme.

This version passes the tests given in test.py and the previously attached JSON files.
Attachment #8485864 - Attachment is obsolete: true
Attachment #8487506 - Flags: review?(jlund)
(In reply to Jordan Lund (:jlund) from comment #10)
> Comment on attachment 8485864 [details] [diff] [review]
> 1064010.patch
> 
> Review of attachment 8485864 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> thanks for taking the initiative on this and attempting it yourself. I think
> we pretty much have it but, and I may be wrong, I think you exposed a bug in
> a jittest loop (see in-line)
> 
> ::: mozilla-tests/mobile_config.py
> @@ +1776,5 @@
> >                  for suite in branch['platforms'][platform][slave_plat][type_][:]:
> >                      if "cppunit" in suite[0]:
> >                          branch['platforms'][platform][slave_plat][type_].remove(suite)
> >  
> > +def remove_suite_from_slave_platform(suite_to_remove, slave_platform, branches_to_keep=[]):
> 
> awesome
> 
> @@ +1789,5 @@
> > +            continue
> > +        for platform in BRANCHES[branch]['platforms']:
> > +            if not platform in PLATFORMS:
> > +                continue
> > +            if not platform.startswith('android'):
> 
> it be nice to make this work outside mobile_config script this and move this
> method out later

I see no reason we couldn't have this.  It looks like master_common is the correct home (like items_{before,at_least}), and we'd just need to parameterize more things.
Comment on attachment 8487506 [details] [diff] [review]
1064010.patch

Review of attachment 8487506 [details] [diff] [review]:
-----------------------------------------------------------------

this is awesome.

this piece is good to go and land once the mozharness necessary changes are in place.
Attachment #8487506 - Flags: review?(jlund) → review+
Comment on attachment 8486854 [details]
test.py

this is also sane and neat.
Attachment #8486854 - Flags: review?(jlund) → review+
(In reply to Jordan Lund (:jlund) from comment #18)
> Comment on attachment 8487506 [details] [diff] [review]
> 1064010.patch
> 
> Review of attachment 8487506 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> this is awesome.
> 
> this piece is good to go and land once the mozharness necessary changes are
> in place.

It's not clear to me why the mozharness changes are necessary for this to land.  In fact, it seems to me imperative that buildbot/CI in general be able to safely start scheduling a job where mozharness/the factory completely fails.  Then people such as myself can get "the bits that we can't control" out of the way before turning to "the bits that we can control".

As it happens, I believe the mozharness pieces have r+ from kmoir, so we can try to stage them together.
jlund, Callek: what has to happen to get this into the buildbot production bug queue?
(In reply to Nick Alexander :nalexander from comment #20)
> (In reply to Jordan Lund (:jlund) from comment #18)
> > Comment on attachment 8487506 [details] [diff] [review]
> > 1064010.patch
> > 
> > Review of attachment 8487506 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > this is awesome.
> > 
> > this piece is good to go and land once the mozharness necessary changes are
> > in place.
> 
> It's not clear to me why the mozharness changes are necessary for this to
> land.  In fact, it seems to me imperative that buildbot/CI in general be
> able to safely start scheduling a job where mozharness/the factory
> completely fails.  Then people such as myself can get "the bits that we
> can't control" out of the way before turning to "the bits that we can
> control".

Sure, I just meant if it was going to be multiple days before you landed anything on mozharness, there is no point scheduling slaves to run dead jobs.

> 
> As it happens, I believe the mozharness pieces have r+ from kmoir, so we can
> try to stage them together.

Landing these together and then iterating on what's broken make sense to me.

As far as landing goes, you can push your reviewed patches on the 'default' branch to build/buildbot-configs and build/mozharness at any time. They will be merged within a work day or two by releng. mozharness default will be 'live' on cedar and cypress. bbot-cfgs default will be a no-op across our branches.
Landed on default, running against 'ash' and 'cedar':

https://hg.mozilla.org/build/buildbot-configs/rev/34df6a529508
In prod with reconfig on 2014-09-15 08:30 PT
Blocks: 1067480
buildbot is dying
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: