Mistakes and statistics

Discussion in 'Comments on the latest newsletter' started by Jeremy Wilkes, Apr 8, 2020.

  1. canadianbeth

    canadianbeth LostCousins Star

    Obviously, I have not yet found the problem and I am about to tear out my hair. There are eight people listed in Ancestry. Two are on a separate page in FMP, so I deleted them and made a separate entry. Another, for some unknown reason, is not listed at all on FMP, even though she was only 2 years old. So I deleted her. And I *still* have the red marks beside the top two names, both of which match the FMP entry. I refreshed the page and then logged out of LC and back in again, hoping that would make them go away, but no. I do not know what else to do.
     
  2. jorghes

    jorghes LostCousins Superstar

    Go into each person and there should be an option to confirm that the data is correct, and save them again.
     
  3. Bryman

    Bryman LostCousins Megastar

    You may well have done all that you can do. The red marks indicate that there are other entries which don't quite match, such as name or YoB with slight difference(s). If they were submitted by another LC member then you will just have to wait for them to make appropriate changes. You will not be informed who that other member is. Sometimes they will also confirm that their entries are correct without any corrections and then the red marks will disappear without a match being made.
     
  4. Helen7

    Helen7 LostCousins Superstar

    Have you checked that the entries match the transcription at Ancestry, and that you have the correct reference for the head of household for all the entries (including those over the page)? You should use the names and ages as transcribed at Ancestry (even if they are wrong) as that is the 'gold standard' as explained by Peter in another thread. If FMP have corrected the names/ages (and/or have a different page for the younger household members), they wouldn't match and that may well be the cause of your red marks.
     
  5. canadianbeth

    canadianbeth LostCousins Star

    I used the entries from Ancestry originally and that is when I got the two red ! next to the first two names. Because they did not match FMP I thought I had to change them, so I did. I still have the two ! marks. Since the Ancestry entry is incorrect, and changing it to the FMP one is also incorrect, I think I will just delete the whole entry. There was no match anyway; I have no matches to any of the entries for that branch of the family.
     
  6. canadianbeth

    canadianbeth LostCousins Star

    Since Peter's newsletter about mistakes and statistics, I have added 179 more names to my ancestors (and found four distant cousins who died in WW1; I know there are a few others as well that I found earlier). I am sure that most of you have done better, but I am still missing a whole branch of my tree. Somewhere in another post I gave my match potential; I cannot remember what it was then, but it is now 3.0935. I have two new names that I need to contact: one on my Dad's maternal side and one on my Mother's maternal side. :) I need to think about just what to say when I write.
     
    Last edited: Apr 14, 2020
  7. peter

    peter Administrator Staff Member

    Yes.
     
  8. peter

    peter Administrator Staff Member

    Fuzzy-matching only looks at names, not ages. Ages have to match precisely.
    They may well be matching with each other if they are the same age and the forenames begin with the same letter. The red exclamation marks won't disappear until you Confirm your entries (see Help info) and click Search.

    Please remember that the exclamation marks are merely a prompt to check the entries carefully against the census - they are NOT telling you that your entry is wrong. We don't have a copy of the census to check against (but you do!).
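
The rule described in this post can be sketched roughly in Python. This is purely illustrative: the site's actual matching code does not appear anywhere in this thread, so the `Entry` type, the `classify` function, the census reference string and the use of the forename initial as a stand-in for fuzzy name matching are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    census_ref: str   # illustrative reference string, not a real household
    forename: str
    surname: str
    birth_year: int


def classify(a: Entry, b: Entry) -> str:
    """Return 'match', 'near' or 'none' under the assumed rule."""
    # Reference and birth year must agree exactly; no tolerance on age.
    if a.census_ref != b.census_ref or a.birth_year != b.birth_year:
        return "none"
    same_forename = a.forename.lower() == b.forename.lower()
    same_surname = a.surname.lower() == b.surname.lower()
    if same_forename and same_surname:
        return "match"
    # Assumed stand-in for fuzzy name matching: same forename initial.
    if a.forename[:1].lower() == b.forename[:1].lower():
        return "near"
    return "none"


mine = Entry("RG11/1234/56/7", "Arthur", "Jakes", 1869)
theirs = Entry("RG11/1234/56/7", "Arthur", "James", 1869)
print(classify(mine, theirs))  # near
```

Under these assumptions a 'near' result corresponds to the red exclamation mark: a prompt to check the entry against the census, not a verdict that it is wrong.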
     
  9. Bob Spiers

    Bob Spiers LostCousins Superstar

    I have read this posting and whilst I understand the general principles discussed and explanations offered - particularly those given by Bryman, who explains things well - I'm afraid taking it all in just makes my head hurt, and 'mother is it worth it' springs to mind. Surely I am not the only one who thinks along those lines?

    It is the principle of computer 'Boolean' logic (to compare like with like) for things to be 'true or false'. This is why 'fuzzy logic' was brought into play to aid searching in Ancestry, FMP and the like by ticking, for example, 'Soundex' or 'Similar' boxes, or substituting * or ? for letter replacements. Without such options, searching would be far harder and more laborious, as we all know.

    Which is why I think it a great shame that something similar cannot be applied to LC comparison matching. It is simple enough to check census refs when entering manually in small batches, or multi matches with a break in between, but next to impossible if they are entered in any great volume via FTA, much as I love the program. I used it once, but never again for that purpose.

    I thought the same and only now, after reading what additional checks need to be made, am I disillusioned that such checks are necessary, and sad that it is all down to the individual. Surely it is time for 'fuzzy' logic to come to the rescue?
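
For anyone unsure what the * and ? substitutions mentioned above do, here is a small Python sketch using the standard library's `fnmatch` module. It only illustrates wildcard matching in general; it is not how Ancestry or FMP implement their searches, and the `wildcard_search` helper and the sample surnames are made up for the example.

```python
from fnmatch import fnmatchcase


def wildcard_search(pattern: str, names: list[str]) -> list[str]:
    """Return the names matching a pattern where * is any run of
    characters and ? is exactly one character (case-insensitive)."""
    pattern = pattern.lower()
    return [name for name in names if fnmatchcase(name.lower(), pattern)]


surnames = ["Jakes", "James", "Jacobs", "Jones"]
print(wildcard_search("Ja?es", surnames))  # ['Jakes', 'James']
print(wildcard_search("Jac*", surnames))   # ['Jacobs']
```

A strict true-or-false comparison would treat 'Jakes' and 'James' as simply different, whereas the pattern `Ja?es` finds both.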
     
  10. peter

    peter Administrator Staff Member

    LostCousins is all about accuracy - if members are careless they may well miss matches. You reap what you sow.

    When I introduced the grey arrows a few years ago I described them as being for reference checking, because up to that point at least 99% of all missed matches were the result of incorrect census references. It is only since the automated uploading feature was added to FTAnalyzer that incorrect ages have been a problem.
     
  11. Bryman

    Bryman LostCousins Megastar

    It all depends on how each individual prefers to work. Some people do not wish to break off from checking census records when they think that they have hit the gold vein. I prefer to enter households as I find them to make it less likely that I will forget to enter them at LC. I then use FTA periodically to confirm that nothing has slipped through the net.

    I think that Peter has got things about right for these matches. The reference must match exactly in order to make sure that the right household is being compared. I now know that the age (YoB) must also match exactly or there could be confusion between two like-named members of the household. I had previously thought that fuzzy logic would apply for differences of a year or two. The fuzzy logic applied to the names allows for misinterpreted handwriting and gives a second chance for near matches to be re-evaluated. My main improvement wish would be for near matches to remain identified somehow when the two parties involved are unable to agree how to make the matches complete, i.e. when both think that their entry is correct. If the references entered are the same then it is unlikely that the household members are different individuals.

    FTA can be a great time advantage when there are many (hundreds) of entries to submit, especially if the census data has not varied between censuses. Members using this method must realize that submission is the easy part and all results must be checked later because a tree can only contain one value for each field whereas the census can have slight variations between years. However, when there are relatively few entries to be added then I believe that a manual approach is better and gets the full attention of the user to enter the correct values.
     
  12. Alexander Bisset

    Alexander Bisset Administrator Staff Member

    Yup, I did want to add an automated verification system by comparing the checked data on the Lost Cousins website with the GEDCOM data. However, I needed a means of working out whether someone had checked a record or not, and at present there doesn't seem to be any method on the Lost Cousins site of verifying whether you have already checked an entry. I suggested to Peter that this would be an extremely useful addition for users, as they would then be able to see if they'd already checked an entry. However, this was shot down in flames as putting members who had entered a lot of data at a disadvantage, as they'd have a lot of unverified entries.

    To be honest I couldn't see anything but positives. If someone is taking the time to enter all those records, the peace of mind of being able to see at a glance that they have been checked is surely a good idea. It would massively improve the quality of the data to know what percentage of records had been checked and what percentage hadn't.

    Sadly, without a method of knowing which records had been checked and which hadn't, I couldn't do any verification. In theory it should be possible to entirely automate the reading of the data from the Lost Cousins site, following the link to the FindMyPast records and verifying that the data matches, all without human intervention. The program could then show what had been mismatched. The problem is you really, really don't want an automated system continually telling you things you've already fixed. So having a flag to indicate it shouldn't re-check a record is vital.

    I've still got the code ready to automate this checking in FTAnalyzer and it's pretty much good to go to fully check everyone's records are right. It just needs the last little missing link that will help improve the quality of the data on the site and give an actual measure of the level of checking.
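
The role of the 'already checked' flag described in this post can be sketched in a few lines of Python. This is not FTAnalyzer's code, and nothing here reads from the Lost Cousins or Findmypast sites: the `Record` type, its `checked` field, the record ids and the `find_mismatches` function are all hypothetical, and the transcript is simply passed in as a dictionary.

```python
from dataclasses import dataclass


@dataclass
class Record:
    name: str
    birth_year: int
    checked: bool = False  # hypothetical flag; the post says the site has none


def find_mismatches(mine: dict[str, Record],
                    transcript: dict[str, tuple[str, int]]) -> list[str]:
    """Return ids of unchecked records that differ from the transcript."""
    mismatches = []
    for rec_id, rec in mine.items():
        if rec.checked:
            # Already verified by the member, so never report it again.
            continue
        if transcript.get(rec_id) != (rec.name, rec.birth_year):
            mismatches.append(rec_id)
    return mismatches


my_entries = {
    "a1": Record("Arthur Wm James", 1869),
    "a2": Record("Mary James", 1871),
    "a3": Record("Ann James", 1840, checked=True),
}
transcript = {
    "a1": ("Arthur Wm James", 1869),
    "a2": ("Mary James", 1872),
    "a3": ("Anne James", 1840),
}
print(find_mismatches(my_entries, transcript))  # ['a2']
```

Record a3 differs from the transcript but is skipped because it is flagged as checked, which is the behaviour the post argues for: the program stops nagging about entries the member has already reviewed.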
     
  13. Alexander Bisset

    Alexander Bisset Administrator Staff Member

    To me the best of all worlds would be to automate updating your Lost Cousins page, then automate checking that it was all correct by following the links to Find My Past and reporting those entries that don't exactly match to the end user. This would eliminate the current issues people have pointed out.

    The less work people have to do, the more likely they are to take those small steps to fix the highlighted inconsistencies. Having to trawl through all of their records when 90% of them are likely to be right is off-putting, and sadly that's what reduces the accuracy of the database overall.
     
  14. peter

    peter Administrator Staff Member

    I agree - it's better that members who are able to enter more relatives focus on that, rather than worrying about the small possibility that one of their entries isn't quite right. The arrows are there primarily so that members who are worried that they're entering the wrong census references can check them out at the time of input.

    When I reminded members about the grey arrows in January I suggested spending 5 minutes checking entries - I was horrified to hear that someone had spent hours doing it.
     
  15. Tim

    Tim Megastar and Moderator Staff Member

    Peter, while we're talking about manual checking, have you considered adding an Age Field? Or replacing the Born Field with Age?

    When we enter the data into the household form, we add the age and not date of birth. When we add a corrected date of birth this is then used and displayed, which means I have to open each member of the household to check if the age I entered is the same as is on the transcriptions.
     
  16. peter

    peter Administrator Staff Member

    When you click the checking arrow the Search results show birth years, not ages - so I don't see that it would help to show ages on the My Ancestors page. (If you look at the household transcript it shows both ages and birth years for the England & Wales censuses.)

    What would be helpful is to be able to suppress corrections, the information that appears in italics. This is something I've been looking into, but so far I haven't been able to achieve it - it's always difficult working with somebody else's code, and it's especially difficult when the code is a mixture of several different programming languages.
     
  17. Tim

    Tim Megastar and Moderator Staff Member

    But as you've pointed out before, you have to click on the transcription to see the whole household which shows the ages that were recorded.
    Yes, suppressing the corrected age and showing the age that you've calculated from the age that was entered is also an option.
     
  18. peter

    peter Administrator Staff Member

    You don't have to view the household transcript - for a start, unless you're a Findmypast subscriber you can only do that for the 1881 Census, and even then it's only necessary when the household is split over two census pages, which only happens now and again.

    Even if you do view the household it shows both the age and the birth year. So showing the age instead of the birth year on the My Ancestors page wouldn't help unless you were checking against the census image, which isn't going to happen very often.
     
  19. Bob Spiers

    Bob Spiers LostCousins Superstar

    Why is it I understand what Tim asks but not the gist of your explanation? I reserve viewing the transcript only for checking on red ! queries, not for checking that estimated birth years equate to actual ages entered on LC when carrying out a census reference check. So (staying with the 1881 Census) if someone shown as 12 equates to 1869, and a quick mental check does not disagree, that should be the end of the matter. Yet, as Tim alludes, one can only be certain by checking the transcript for each household member, and that is a step too far.
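
The mental check mentioned above (age 12 in 1881 gives 1869) is just a subtraction, sketched here in Python. The function names are made up for the example, and it is an assumption that the site derives the displayed birth year this way. The second function reflects the fact that the 1881 Census was taken on the night of 3 April, so a person recorded as 12 could have been born in either of two calendar years.

```python
def estimated_birth_year(census_year: int, age: int) -> int:
    """Simple subtraction, as in the 12 -> 1869 example above."""
    return census_year - age


def possible_birth_years(census_year: int, age: int) -> tuple[int, int]:
    """Both calendar years in which someone of this age could have been born."""
    return (census_year - age - 1, census_year - age)


print(estimated_birth_year(1881, 12))  # 1869
print(possible_birth_years(1881, 12))  # (1868, 1869)
```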
     
  20. peter

    peter Administrator Staff Member

    I don't quite understand what you're saying, Bob, so I'll restate my comments in a different way.

    How should you check your entries against the census? Click the grey arrow then compare what you see on your My Ancestors page against the Search results. (Both show year of birth - neither shows age.)

    Are there occasions when it is necessary to view the entries themselves? Yes, but they are fairly few and far between. It's most likely to be necessary when you've entered corrections, in which case they'll be shown in italics on your My Ancestors page - if you've made a correction then you have to look at your entry to see what the information is in the first part of the form. (Corrections are optional - if you don't make corrections it's not only quicker to enter the data in the first place, it's also quicker to check it at a later stage. But I am trying to find a way of suppressing corrections to make checking easier.)

    The other likely situation is a household split onto two census pages (this only applies to the 1841/1881 censuses). In this case the Search results will only show the entries from the first page, so if you want to check the entire household you need to view the household transcript (which is free for the 1881 Census).

    The third case in which you might think it's necessary to look closer is when the names you've entered from 1841/1911 don't tally with the transcript. But in most cases it ISN'T necessary, because you'll know instinctively if the transcript is wrong. For example, if your entry reads Arthur William Jakes and the transcript reads Arthur Wm James it's likely you have erroneously expanded the middle name, but got the surname right (since a householder is unlikely to have misspelled his own surname).

    In general the process of checking is intended to be quick and easy - if it takes more than 20 seconds per household on average there's something wrong somewhere. And remember, the best time to check is when you're entering the data in the first place.
     
