Sheriff Log: Chromium OS

          2014-04-17

          Sheriffs: milleral, thieule

          • [364617] Beaglebone servo image is too big
          • [356020] [bvt] reset Failure on falco-chrome-pfq/R35-5684.0.0-rc4
          • [355843] TreeCloser: build failure in DebugSymbols 600sec timeout
          • [364669] Daisy skate build is failing on serial-tty
          • [359223] [monroe] graphics_SanAngeles suspected to reboot/hang machine
          • [358737] [bvt] graphics_GpuReset Failure on stout-release/R36-5718.0.0

          2014-04-16

          Sheriffs: milleral, thieule

          2014-04-15

          Gardener: derat

          • [363884] LKGMSync step failing repeatedly due to bad SHA1 when syncing coreboot
          • started new x86-generic nightly chromium PFQ build since last night's run died on some slaves

          2014-04-14

          Sheriffs: jchuang, reinauer, Gardener: derat

          • [363339] stumpy moblab failure 
          • [363294] sandybridge-canary failed.
          • [362999] Failed cbuildbot failed hwtest [bvt] [parrot_ivb]. Transient.
          • [358737, 356020] Transient HWtest fail on falco and wolf (both issues have been auto filed many times)
          • [363015] Failed cbuildbot failed debugsymbols [x86-mario]
          • [363167] HWTest step timed out on daisy_spring and falco PFQ

          2014-04-11

          Sheriffs: olofj, adlr, josephshi, ihf

          • [362621] Removed pyauto dependencies that broke PFQ.

          2014-04-05

          Sheriffs: keybuk, bfreed

          • [339291] Reverted a set of CLs that caused platform_Powerwash failure.
          • [360797] The chumps from 339291 broke incremental builders and required deputy assistance.
          • [360898] video_DecodeAccelerator is increasingly unreliable.  Maybe we should make it experimental?

          2014-04-04

          Sheriffs: keybuk, bfreed

          • [360084] rambi-b canary build failed in the Archive stage on loopback mount failure.  Believed transient.
          • [360082] Chrome PFQ fails with unknown linker flag (--reduce-memory-overheads), likely because of https://codereview.chromium.org/225093005 last night.

          2014-04-02 & 2014-04-03

          Sheriffs: pprabhu, dgreid

          • [359143] generate_payload failed to unmount a file system, and we tried to rm it later. pprabhu@ forced canaries to restart, since ongoing runs were all going to hit this issue. But it took a while to make this decision.
            This hit us again later in the day, keeping canaries red almost all day. The reason was that a script had to be manually uprevved to pull in the revert. See CL.
          • [359227] VMTest hung. Root cause unknown.
          • [359422] VMTest failed because VM ran out of space when running the tests. We reverted the CL in the morning. Unfortunately, although all the canaries are back online, we can't get the CQ to pass yet, due to flakes + lots of CLs trying to get in. So, we need to uprev the manifest by allowing a noop CL through a throttled tree.
          • [348199] and [353590] daisy_spring-pfq failed because of known GOLO and update engine flakes.
          • [359760] beaglebone_servo canary is currently broken. [TODO: If the canary is still red at EOD, revert the CL mentioned in the bug].

          2014-03-31

            Sheriffs: katierh, dparker

            • [358180] Daisy canary builder failure caused by error in a git repo. Existing error turned into a warning due to crbug.com/352692
            • [358075] daisy_skate and daisy_spring canary failing due to clustered Chrome builds on one builder. Variants are not using the prebuilt Chrome from daisy.

            2014-03-28

            Sheriffs: dbasehore, armansito, sheckylin

            • [357364] Tree doesn't close anymore when builds fail. Looks like it's fixed now.
            • repeated failures in daisy canary during build packages.

            2014-03-27

            Sheriffs: dbasehore, armansito, sheckylin

            • [353906] Builder out of space
            • [357093] x86 generic ASAN fail due to Chrome
            • [357202] Pre-CQ timeout.

            2014-03-25

            Sheriffs: dianders, vbendeb

            • [356187] widespread provision failures; waiting for lab sheriff for the most part
            • [356198] and [356199] video_VideoDecodeAccelerator - probably a duplicate of [353898]. There's a possible fix for that but it hasn't gone back to R34 yet.
            • ... lab issue is hopefully fixed now ...
            • ... David James and crew restarted CQ ...
            • ... various things handled by David James ...
            • [355843] beltino canary - DebugSymbols hung for 600 seconds
            • [348188] slippy canary and daisy canary - Flood of "Too many open files"
            • Chrome uprev has failed a few times; Chrome sheriff handling?

            2014-03-21 and 2014-03-24

            Sheriffs: jrbarnette, tbroch

            • [355843] beltino canary: DebugSymbols failed during upload with timeout
            • daisy incremental: CQ bug missed bump of chromeos-init for CL:190619 initially (race?) then got it fixed making manual override here unnecessary.
            • [353018] sandybridge canary: OSError(16, 'Device or resource busy') ... believed to be not enough loopback devices.
            • [354573] A bug in Chrome caused the x86-generic paladin to fail multiple times in VM testing.

            2014-03-19 and 2014-03-20

            Sheriffs: snanda

            • [344506]: peppy canary failed to reboot due to ASIX USB issue.
            • [352276]: falco canary platform2 build failure.  http://crosreview.com/190820 is the fix but still waiting to be blessed by CQ.
            • [354496]: monroe paladin misbehaved for a while.
            • [354262]: sandybridge build failed.  DUT was down?
            • [311350]: platform_Powerwash Failure on daisy_spring-release. USB dongle flakiness?

            2014-03-17 and 2014-03-18

            Sheriffs: yjlou, wfrichar, victoryang, hungte (TPE)

            • [352994] cros_generate_test_payloads failed to find image folder (race condition)
            • [353429] chrome/chromium pfq bots died in build_image due to missing libmojo_system.so
            • [353461] failure in uploading DebugSymbols

            2014-03-13 and 2014-03-14

            Sheriffs: cywang(TPE), dgarrett, bleung, gwendal

            • [348855] amd64-generic-asan: logging_UserCrash timed out (flaky)
            • [352093] daisy_spring: HWtest job timeout, but tests are still running
            • [350677] x86-generic-full : cryptohome fails to link.
            • [352276] platform2-0.0.1-r366 fails on arm-generic full
            • [352297] Pre-CQ Failure- Gerrit Code Review requires Java 7
            • [348855] amd64-generic-asan: logging_UserCrash timed out
            • [352428] x86-generic asan : logging_AsanCrashTelemetry : Unhandled TabCrashException: Handshake Status 500
            • [72633] x86-generic incremental: login_OwnershipNotRetaken
            • [352520] atom canary: x86-mario: build_image failed (can't read superblock)

            2014-03-11 and 2014-03-12

            Sheriffs: dlaurie, grundler

            2014-03-05 and 2014-03-06

            Sheriffs: miletus, shawnn

            • [337490] daisy incremental unmount completed, but returned an error
            • [348855] amd64-generic-asan, logging_UserCrash timed out
            • [348758] x86-generic-asan failure. not sure how to interpret the failure message.
            • [349559] Signer failure on all canaries
            • [343442] Wolf Paladin builder wedged
            • [349597] chrome-internal-fetch netrc credentials revoked

            2014-03-03 and 2014-03-04

            Sheriffs: quiche, vpalatin

            • [348607] Chrome PFQ failure. later Chrome builds cycled green.
            • [345501] platform_FilePerms: jrbarnette@ has CL checked in, but it wasn't picked up by the lab server. lab team will update its server.
            • [347932] security_AccountsBaseline (multiple times). cmasone@ investigating.
            • [348059] chromiumos sdk builder failing. pinged bug.
            • [348758] x86-generic-asan failure. not sure how to interpret the failure message.
            • [348799] stumpy-paladin, reboot failure in autoupdate_CatchBadSignatures
            • [348805] x86-generic, e2fsprogs failed to emerge
            • [330670] breakpad unittest failure on amd64-generic
            • [348855] amd64-generic-asan, logging_UserCrash timed out (2x)
            • [348889] duck canary failure
            • [345491] x86-mario canary: GSResponseError 403
            • [349073] parrot canary: platform_PowerWash failed
            • [337490] daisy incremental unmount completed, but returned an error
            • [349187] x86-mario canary failed in ChromeSDK: out of disk?
            • [349292] duck canary, GS_ERROR: Attempt to get key

            2014-02-19 and 2014-02-20

            Sheriffs: olofj, pstew

            • mario-canary fails in UReadAheadServer.  No logs.
            • Chrome uprev failed on thermal, dianders@ to revbump package, but failed again
            • [344914] CQ failing due to failure to build hostapd, deemed to be a corrupted tarball in the builder's cache.
            • [345098] New print_repo_status.py factory install script broke the archive process.
            • [345210] Rash of signer test failures (alex, slippy, parrot, leon)
            • [345479] VMTests fail with 'NoneType' object has no attribute 'Cleanup'
            • [345491] GS AccessDenied error while uploading prebuilts for slippy_canary.  Invoking troopers.
            • [345476] login_CryptohomeIncognitoTelemetry and ScreenLockerTelemetry [344849] causing chrome uprev issues

            2014-02-17 and 2014-02-18

            waihong (TPE), bhthompson, marcheu:

            • peach_pit canary hwtest flake -> crbug.com/344427
            • amd64-generic paladin machine went offline for a while. Contacted the Trooper to fix.
            • stumpy canary hwtest flake - happened again, crbug.com/344173

            2014-02-13 and 2014-02-14

            reinauer, garnold

            • beaglebone canary failed on DebugSymbols stage; appears to be a flake (crbug.com/344059).

            2014-02-12

            If you are seeing double-free/heap corruption errors when running gn during ChromeSDK runs (e.g.:

            https://uberchromegw.corp.google.com/i/chromeos/builders/x86-alex%20canary/builds/4655/steps/ChromeSDK%20%5Bx86-alex%5D/logs/stdio)

            it's probably crbug.com/335587. Please see my explanation in the bug.

            - posciak

            2014-02-11

              benchan, sosa, owenlin:

              2014-02-10

              dkrahn, adlr, kcwu (TPE):

              • Due to DiRT, MTV was offline and internal waterfall was affected ~14:00 - 17:00. Some buildbots were affected as well.
              • Another chrome pfq vmtest failure on falco. crbug.com/212879 - these don't actually block uprev, see crbug.com/342425
              • Google Storage issues (fiber cut) cause canary failures - https://a.corp.google.com/#102649
              • Filed crbug.com/342497 for link canary build error.
              • linux_chromeos dbg 2 bot has very long cycle time (~4 hours) so failures may show up late, filed crbug.com/342588

              2014-02-07

              dkrahn, adlr, kcwu

              • mario canary hwtest bvt failure - emailed troopers to escalate. This has been going on for a while now, it seems
              • chrome pfq vmtest failure - next build cycled green, so letting this one go
              • falco chrome pfq vmtest failure - crbug.com/212879
              • mario canary hwtest bvt failure ended up being a combination of crbug.com/339702 and no logs reported - crbug.com/341494
              • investigated repeated failures on x86-generic asan builder - filed crbug.com/341922
              • daisy_spring starvation in the lab due to crbug.com/339636
              • wolf_canary lab flake due to crbug.com/340839

              2014-02-06

              skuhne, jwerner, thieule:

              • Tryserver unavailable ~13:00 - 16:00
              • HW lab down 14:39 - 15:46
              • crbug.com/341658: dev_install failed due to connection timeout
              • crbug.com/337490: amd64-generic-incremental build failure during build image due to unmountable partition. Device busy.
              • crbug.com/212879: system sometimes does not come up after reboot in VMTest
              • SimpleTestAndVerify fails on x86-generic ASAN. Saw that yesterday already, but there were many more problems. (-> crbug.com/337848)
              • PFQ for daisy_spring is still failing in HWTests since there have apparently been no machines for 6 days (-> crbug.com/339636)

              2014-02-05

              mtennant, skuhne, jwerner, thieule (MTV):

              • PFQ for falco, lumpy, etc. had failures. Might be a fluke following "Failed to run /home/chrome-bot/depot-tools/gclient runhooks". Retriggered and clobbered, but no success (-> crbug.com/341179). PFQ still seems to be broken at 5:45pm, but it should cycle green tomorrow morning since Chrome was red as well.
              • PFQ x86: lab failure or restarting VM's (-> crbug.com/212879)
              • Several x86-generic and amd64-generic failures in SimpleTestUpdateAndVerify (-> crbug.com/212879; VM hung on reboot, rolling back 3.10 kernel switch on generic to fix)
              • daisy canary - random tests failed due to crashdumps from (non-fatal) Xorg crashes that took too long to symbolize (client test is marked GOOD but server job times out)
              • daisy_spring canary - all tests after a certain point in the suite failed with ABORT (suite hit 2h timeout since not enough lab devices available soon enough to finish on time)
              • dev_install test is failing on canaries - crbug.com/341266 (offending change was rolled back with some difficulties; problem was identified and will be fixed on reupload)

              2014-02-04

              posciak (TOK)

              dianders, rspangler (MTV):

              2014-02-03

              dianders, rspangler

              2014-01-31

              tbroch, jrbarnette

              • crbug.com/339934 race/flake for tar during DebugSymbols ... tar: debug/bin: file changed as we read it
              • crbug.com/335587 double-free/corruption errors when running gn during ChromeSDK stage
              • crbug.com/339743 [bvt] network_VPNConnect.l2tpipsec_cert Failure
              • crbug.com/337490 mount failure during build_image (error status 32)

              posciak (TOK)

              • See crbug.com/335587 for a probable reason behind occasional double-free/corruption errors when running gn during ChromeSDK stage
              • build failures as CQ missed one of the CQ-DEPEND CLs, because it was uploaded as a draft
              • login timeouts on link canary in a few bvt tests; suspecting https://codereview.chromium.org/148843002 to have made login last longer/stop working... may need to be followed up on if it persists.

              2014-01-30

              tbroch, jrbarnette

              • crbug.com/339573 failed to uprev chrome 34.0.1813 due to proto change for LocalExtensionCache::CacheItemInfo::CacheItemInfo
                • contacted chrome gardener (harrym) to resolve
              • crbug.com/310783 stumpy canary. 
              • crbug.com/339135 leon/samus/link/panther canary failures for vm_test fix here.
              • crbug.com/338085 radvd ebuild failure ... fixed with clobber build.  Email triage by (jamescook, avi, xiyuan, achuith)

              posciak (TOK)

              2014-01-24/2014-01-27

              vapier

              2014-01-24/2014-01-27

              katierh, armansito

              • Jan 27 was full of clobbering...
                • Needed a few reverts due to bad eclass CLs landing - and then lots of clobbering, removing prebuilts, etc to get the tree in a sane shape - crosbug.com/338085
              • Jan 24 had a number of failures due to Gaia corp errors...

              2014-01-23

              derat, wiley, dparker

              • crbug.com/337490 failure in build image: mount(8) failed: Device or resource busy
              • Timeouts in login_CryptohomeTelemetry. Fixed by this revert.
              • PFQ failures with "TEST_NA: Unsatisfiable DEPENDENCIES" caused by a server dying in the lab (per scottz@).
              • CQ fails in unittests due to timeout in chromite: crbug.com/337602

              2014-01-22

              derat, wiley, dparker

              • crbug.com/336742: all PFQs failed due to factory-test-init and chromeos-test-init conflict (see 2014-01-20). forced rebuilds
              • crbug.com/334958: two CertificateManagerBrowserTest tests failing on "Linux ChromiumOS Tests (dbg)(2)" builder
              • BVT failure on pit caused by kernel crashes: crbug.com/336839
              • BVT failure on ZGB caused by flake in network_DhcpStaticIP: crbug.com/336767

              2014-01-20

              reveman, pprabhu, dgreid

              • crbug.com/335978: security_ptraceRestrictions failed due to test_image update. Fixed by this revert.
              • factorytest-init and chromeos-test-init package conflict. Fixed by this revert.
              • crbug.com/336296: Arm canaries and pfq were broken. It was mostly a chrome issue, fix had already made its way to ToT pfq builders. TODO(sheriff): Make sure that the nightly-pfq picks up this change. Essentially, make sure that nightly-pfq has a green run.
              • crbug.com/336634: We didn't have enough daisy_spring DUTs in the lab, so HWTest timed out on ChromePFQ a couple times.

              2014-01-16 and 2014-01-17

              bfreed, snanda, ellyjones

              2014-01-10 and 2014-01-13

              bleung, dbasehore, spang

              • crbug.com/333310: Node.js issue with downloading Chrome. Caused all canary builders and full builders to fail.
              • crbug.com/333398: Delay between ebuild commit and uprev commit
              • crbug.com/332645: beltino canary failed in archive, might be low memory issue.

              2014-01-06 and 2014-01-07

              jsalz (TPE), dlaurie, grundler

              • crbug.com/332104: Recurring issue in ManifestVersionedSync step on several builders (zgb, falco, peppy, stout) and experimental builders.  This is the top item for tree closure on 1/7.
              • crbug.com/332145: Chrome PFQ nightly failing to compile.  Fixed in chrome already, should be good for tomorrow's build.
              • crbug.com/327651: autoupdate_EndToEndTest failure
              • crbug.com/329248: "Update failed" in VMTest
              • daisy_incremental out of space
              • crbug.com/327388: experimental_platform_RebootAfterUpdate suspected of putting machines in Repair Failed state.
              • crbug.com/331176: login_CryptohomeIncognitoTelemetry suspected of putting machines in Repair Failed state.
              • crbug.com/328360: audiovideo_VDA failed
              • crbug.com/331318: "Suite prep" failure - probable bvt timeout [update: fixed on parrot_ivb]
              • crbug.com/331754: platform_FilePerms: "/dev/pts" is missing options "set(['mode=620', 'gid=5'])"
              • crbug.com/324907: GerritHelperTest unit test failure
              • crbug.com/331756: UploadPrebuilts fails with CommandException: Invalid canned ACL

              2014-01-02 and 2014-01-03

              djkurtz (TPE)

              • crbug.com/329777 - autoupdate_CatchBadSignatures hash failure
              • crbug.com/317309 - "daisy canary" - autoupdate_Rollback failed to find a job_repo_url for the given host
              • crbug.com/331318 - "parrot canary" - time out during bvt 

              2013-12-25 and 2013-12-26

              hungte (TPE)

              2013-12-23 and 2013-12-24

              jcliang (TPE)

              2013-12-19 and 2013-12-20

                cywang (TPE), gabeblack, zork

                2013-12-17 and 2013-12-18

                waihong (TPE), shawnn, charliemooney

                2013-12-09 and 2013-12-10

                sabercrombie, milleral, miletus, mtennant (Chrome OS build deputy), rginda (Chrome gardener)

                • crbug.com/327005 - chromeos-base/telemetry failed on chrome_pfq, multiple platforms
                • crbug.com/327007 - Parrot canary failed because of chromeos-chrome build failed

                2013-12-5 and 2013-12-6

                dbasehore, pstew, seanpaul, tengs (chrome)

                2013-12-3 and 2013-12-4

                reinauer, benchan, josephsih, jamescook (chrome)
                • crbug.com/324872 - repeated hwtest timeout on daisy_spring
                • crbug.com/325056 TestFailure on HWTest [bvt]: network_DefaultProfileCreation: Missing setting CheckPortalList=ethernet,wifi,cellular
                • crbug.com/212879 - lumpy chrome pfq failing (SimpleTestUpdateAndVerify fails, system doesn't come up after reboot)
                • crbug.com/325610 - devinstall_test failed with KeyboardInterrupt: SIGINT received in VMTest x86-alex canary 
                • crbug.com/325617 - chromite unittest failure on samus canary
                • crbug.com/325629 - chromite unittest gerrit_unittest failure with KeyError: 'http' on amd64-generic full and x86-mario. szager has submitted a patch to fix this problem.
                • crbug.com/325632 - ALL bvt tests failed. I believe they all had the same cause. Merged all the other auto-filed issues into this one to track this bug.

                2013-12-2 (and 11-29 - holiday)

                skuhne, dkrahn, rspangler, kinaba
                • Saw several timeout problems (auto update, VMTest); investigating. Re-ran builder with clobber.
                • crbug.com/212879 - lumpy chrome pfq failing
                • crbug.com/319997 - daisy_spring canary hwtests failing b/c dut not coming back after reboot
                • crbug.com/324872 - filed this bug to track repeated hwtest timeout instances on daisy_spring
                • crbug.com/324907 - filed this bug to track repeated chromite unittest failure on x86-mario canary
                • crbug.com/324916 - filed this bug to track repeated perf (benchmark tests) failures on lumpy, parrot, daisy, ..
                • crbug.com/317903 - closed the tree because lumpy paladin will not pass until this is fixed -- update: temporary fix here and tree reopened


                OLDER ENTRIES MOVED TO THE ARCHIVE so this page doesn't take forever to load.  See Sheriff Log: Chromium OS (ARCHIVE!)
