This repository was archived by the owner on Apr 4, 2021. It is now read-only.

Revert "Bump github.com/PuerkitoBio/goquery from 1.5.1 to 1.6.0" #142

@syou6162 syou6162 commented Dec 17, 2020

Reverts #130

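It now panics, so this revert is a stopgap for the time being. The options are to catch the panic with recover or to send a PR to the upstream library.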


Fetching(16): https://note.com/cd/sessions?redirect_to=https%3A%2F%2Fcomemo.nikkei.com%2Fn%2Fnb42d532e4e44&m=AwVa1yXCeWzj1OuEgoVDWQ%2BYC7ULgaKcrYoToF%2ByJlcgE1JpYZyc%2Fi5AU6Q8kbEW
--
Fetching(4): https://t.co/EjFW6
Fetching(7): https://t.co/7fqhdm
Fetching(8): https://www.itmedia.co.jp/business/articles/2012/17/news029.html#utm_term=share_sp
Fetching(14): https://www.itmedia.co.jp/news/articles/2012/02/news094.html
Fetching(11): https://t.co/ucxsTI
Fetching(12): https://odaibako.net/detail/request/2469d702-f212-4a49-9883-ed707f24df23?card
Fetching(9): https://t.co/gUrV4G
Fetching(2): https://onezero.medium.com/the-long-forgotten-story-of-ben-gardiner-the-aids-activist-whose-network-transformed-the-internet-c14460a73165?source=social---tw.mediumtw&gi=sd
Fetching(10): https://arxiv.org/pdf/2002.08791.pdf
Fetching(6): https://ow.ly/nWwa30rkBLu
Fetching(3): https://jamanetwork.com/journals/jamapsychiatry/fullarticle/2681169
Fetching(13): https://news.livedoor.com/article/detail/19344343/
Fetching(1): https://www.itmedia.co.jp/news/articles/2012/09/news006.html#utm_term=share_sp
Fetching(5): https://t.co/tEEpK
2020/12/17 17:16:18 404 Not Found: Cannot fetch https://t.co/EjFW6
Fetching(15): https://www.itmedia.co.jp/news/articles/2012/08/news132.html
2020/12/17 17:16:18 404 Not Found: Cannot fetch https://t.co/7fqhdm
2020/12/17 17:16:18 404 Not Found: Cannot fetch https://t.co/gUrV4G
2020/12/17 17:16:18 404 Not Found: Cannot fetch https://t.co/ucxsTI
2020/12/17 17:16:18 404 Not Found: Cannot fetch https://t.co/tEEpK
2020/12/17 17:16:18 Invalid utf8 document: https://www.itmedia.co.jp/business/articles/2012/17/news029.html#utm_term=share_sp
2020/12/17 17:16:18 Invalid utf8 document: https://www.itmedia.co.jp/news/articles/2012/02/news094.html
2020/12/17 17:16:19 Get https://jamanetwork.com/journals/jamapsychiatry/fullarticle/2681169: read tcp 10.0.0.10:33168->173.254.190.147:443: read: connection reset by peer
2020/12/17 17:16:19 Invalid utf8 document: https://www.itmedia.co.jp/news/articles/2012/08/news132.html
2020/12/17 17:16:19 Invalid utf8 document: https://www.itmedia.co.jp/news/articles/2012/09/news006.html#utm_term=share_sp
2020/12/17 17:16:19 Get https://ow.ly/nWwa30rkBLu: dial tcp 54.183.130.144:443: connect: connection refused
2020/12/17 17:16:19 Invalid utf8 document: http://karapaia.com/archives/52297454.html
2020/12/17 17:16:19 Invalid utf8 document: https://news.livedoor.com/article/detail/19344343/
panic: goquery: failed to parse HTML: html: ParseFragment of non-element Node
goroutine 736 [running]:
github.com/PuerkitoBio/goquery.parseHtmlWithContext(0xc00d910f00, 0x7, 0xc0001d5a40, 0x5, 0x480aa60, 0x4540e00)
/Users/yasuhisa/go/pkg/mod/github.com/!puerkito!bio/goquery@v1.6.0/manipulation.go:579 +0x155
github.com/PuerkitoBio/goquery.(*Selection).eachNodeHtml(0xc003cfeab0, 0xc00d910f00, 0x7, 0x1, 0x405f150, 0x7)
/Users/yasuhisa/go/pkg/mod/github.com/!puerkito!bio/goquery@v1.6.0/manipulation.go:669 +0x1bd
github.com/PuerkitoBio/goquery.(*Selection).BeforeHtml(0xc003cfeab0, 0xc00d910f00, 0x7, 0x0)
/Users/yasuhisa/go/pkg/mod/github.com/!puerkito!bio/goquery@v1.6.0/manipulation.go:138 +0x50
github.com/syou6162/GoOse.(*Cleaner).convertDivsToParagraphs.func1(0x8, 0xc003cfe9c0)
/Users/yasuhisa/go/pkg/mod/github.com/syou6162/!go!ose@v0.0.0-20190108170554-09969ebeb09f/cleaner.go:553 +0x28c
github.com/PuerkitoBio/goquery.(*Selection).Each(0xc003cfe420, 0xc002f0d068, 0x4)
/Users/yasuhisa/go/pkg/mod/github.com/!puerkito!bio/goquery@v1.6.0/iteration.go:10 +0x53
github.com/syou6162/GoOse.(*Cleaner).convertDivsToParagraphs(0xc002f0d3e8, 0xc003704180, 0xb4ec09, 0x4, 0xc003704180)
/Users/yasuhisa/go/pkg/mod/github.com/syou6162/!go!ose@v0.0.0-20190108170554-09969ebeb09f/cleaner.go:495 +0xd8
github.com/syou6162/GoOse.(*Cleaner).Clean(0xc002f0d3e8, 0xc003704180, 0x3a882180)
/Users/yasuhisa/go/pkg/mod/github.com/syou6162/!go!ose@v0.0.0-20190108170554-09969ebeb09f/cleaner.go:301 +0x319
github.com/syou6162/GoOse.Crawler.Crawl(0xb4d608, 0x0, 0x1194, 0xb4d81e, 0x2, 0xb59f2b, 0x10, 0xb5ad59, 0x11, 0xb7862d, ...)
/Users/yasuhisa/go/pkg/mod/github.com/syou6162/!go!ose@v0.0.0-20190108170554-09969ebeb09f/crawler.go:195 +0x6bc
github.com/syou6162/GoOse.Goose.ExtractFromRawHTML(0xb4d608, 0x0, 0x1194, 0xb4d81e, 0x2, 0xb59f2b, 0x10, 0xb5ad59, 0x11, 0xb7862d, ...)
/Users/yasuhisa/go/pkg/mod/github.com/syou6162/!go!ose@v0.0.0-20190108170554-09969ebeb09f/goose.go:24 +0x142
github.com/syou6162/go-active-learning/lib/fetcher.GetArticle(0xc0000e06e0, 0xab, 0x0, 0x0, 0x0)
/Users/yasuhisa/go/pkg/mod/github.com/syou6162/go-active-learning@v0.4.0/lib/fetcher/fetcher.go:116 +0x52b
github.com/syou6162/go-active-learning/lib/service.fetchMetaData(0xc0003fa360, 0xc00000e020, 0xc00dd6bf98)
/Users/yasuhisa/go/pkg/mod/github.com/syou6162/go-active-learning@v0.4.0/lib/service/example.go:287 +0x50
github.com/syou6162/go-active-learning/lib/service.(*goActiveLearningApp).Fetch.func1(0xc00d906460, 0xc0002ddcb0, 0xc00d818660, 0xc0003fa360, 0x2)
/Users/yasuhisa/go/pkg/mod/github.com/syou6162/go-active-learning@v0.4.0/lib/service/example.go:350 +0x1f1
created by github.com/syou6162/go-active-learning/lib/service.(*goActiveLearningApp).Fetch
/Users/yasuhisa/go/pkg/mod/github.com/syou6162/go-active-learning@v0.4.0/lib/service/example.go:342 +0x291
command exited with code: 2
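
As a rough illustration of the "catch it with recover" option mentioned above, here is a minimal sketch, not the fix that was actually applied. `safeCall` and `extractThatPanics` are hypothetical names introduced only for this example; `extractThatPanics` stands in for the real extraction call seen in the trace and simply reproduces the panic message. The sketch assumes the wrapper is invoked inside the same goroutine that does the fetching, because `recover` only catches panics raised in its own goroutine.

```go
package main

import "fmt"

// safeCall runs fn and converts any panic raised inside it into an error.
// Hypothetical helper: it is not part of goquery, GoOse, or go-active-learning.
func safeCall(fn func() (string, error)) (result string, err error) {
	defer func() {
		if r := recover(); r != nil {
			// Discard any partial result and report the panic as an ordinary error.
			result = ""
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return fn()
}

// extractThatPanics is a stand-in for the extraction call in the trace above.
// It only mimics the observed panic message; it does not call any real library.
func extractThatPanics() (string, error) {
	panic("goquery: failed to parse HTML: html: ParseFragment of non-element Node")
}

func main() {
	if _, err := safeCall(extractThatPanics); err != nil {
		// One bad document is logged and skipped instead of killing the process.
		fmt.Println("skipping article:", err)
	}
}
```

With a wrapper like this, a single unparsable page would show up as one more error line in the log rather than ending the whole run with exit code 2. It only hides the symptom, though; the underlying behavior would still need a fix upstream.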


@syou6162 syou6162 self-assigned this Dec 17, 2020
@syou6162 syou6162 merged commit 99f3a7b into master Dec 17, 2020
@syou6162 syou6162 deleted the revert-130-dependabot/go_modules/github.com/PuerkitoBio/goquery-1.6.0 branch December 17, 2020 17:43