Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

accept arbitrary urls #110

Open
drahnr opened this issue Sep 10, 2020 · 6 comments
Open

accept arbitrary urls #110

drahnr opened this issue Sep 10, 2020 · 6 comments
Assignees
Labels
bug Something isn't working checker / hunspell hunspell checker related topics checker generic checker topics good first issue 🔰 Good for newcomers

Comments

@drahnr
Copy link
Owner

drahnr commented Sep 10, 2020

Describe the bug

Alt image text is checked, but there we should be a bit more lax and allow arbitrary urls or until #44 the alt image text should be ignored. Normally this is never shown anyways.

To Reproduce

Steps to reproduce the behaviour:

  1. A file containing [![crates.io](https://img.shields.io/crates/v/cargo_spellcheck.svg)](https://crates.io/crates/cargo-spellcheck)
  2. Run cargo spellcheck README.md

Expected behavior

Please complete the following information:

  • System: Fedora
  • Obtained: git

Additional context

This is an issue with the tokenizer too, it splits on . tokens. But we could whitelist valid urls in alt image names.

Note that this can most likely not be solved by a user dictionary, since it will never see the full url, but only subtokens.

@drahnr drahnr added the bug Something isn't working label Sep 10, 2020
@drahnr drahnr assigned drahnr and unassigned drahnr Sep 10, 2020
@drahnr drahnr added checker generic checker topics checker / hunspell hunspell checker related topics good first issue 🔰 Good for newcomers hacktoberfest labels Sep 10, 2020
@drahnr
Copy link
Owner Author

drahnr commented Sep 28, 2020

Since this has so much overlap with #44, I am assigning this to you as well - maybe an additional test case will already be enough given your work on #113

@laysauchoa
Copy link
Collaborator

laysauchoa commented Sep 29, 2020

it makes sense, thanks! So do you mean that the altext should not be corrected in case is followed by a link? or that the link should not be corrected?

This is what it looks in this branch: laysa-cmark-links-handled

error: spellcheck(Hunspell)
  --> /home/tmhdev/Documents/cargo-spellcheck/README.md:2
   |
 2 | ![altext](filenamve)
   |   ^^^^^^
   | - alt ext, alt-ext, textual, textural, external, or exalt
   |
   |   Possible spelling mistake found.

error: spellcheck(Hunspell)
  --> /home/tmhdev/Documents/cargo-spellcheck/README.md:3
   |
 3 | [![crates.io](https://img.shields.io/crates/v/cargo_spellcheck.svg)](https://crates.io/crates/cargo-spellcheck)
   |           ^^
   | - oi, Io, ii, ion, bio, Rio, is, or one of 7 others
   |
   |   Possible spelling mistake found.

@drahnr
Copy link
Owner Author

drahnr commented Sep 29, 2020

I think the alt text should be checked if it is an url, if so, just accept it. If not, check the text like any other. I think that makes most sense. Usually alt text contains only a few key words or an url.

@laysauchoa
Copy link
Collaborator

For the example,
[![crates.io](https://img.shields.io/crates/v/cargo_spellcheck. svg)] (https://crates.io/crates/cargo-spellcheck)
What should be checked in this case?highlighted in negrito.

Ignore that I added some spaces just to avoid showing images or make links.

@drahnr
Copy link
Owner Author

drahnr commented Oct 6, 2020

I would expect that crates.io is checked if it is a valid url, and if so it is skipped to be spell checked.

Additionally, if requested by the user, both urls should be checked for existence https://img.shields.io/crates/v/cargo_spellcheck.svg https://crates.io/crates/cargo-spellcheck ).

Does that make sense?

@drahnr
Copy link
Owner Author

drahnr commented Nov 11, 2020

To speed this up, we could borrow some code from https://github.com/deadlinks/cargo-deadlinks , which works ontop of html files. We could re-use the link check code, and apply it directly on the source, without having to deal with the html parsing. Just an idea though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working checker / hunspell hunspell checker related topics checker generic checker topics good first issue 🔰 Good for newcomers
Projects
None yet
Development

No branches or pull requests

2 participants