I’m working on some analysis using Trap records data (from the Reports section of the projects), but I haven’t managed to find a data dictionary for the variables included in the dataset.
Can someone please clarify the difference between “trap.nid” and “nid”?
I’ve found cases where multiple entries have been recorded for the same “trap.nid” at the exact same date and time and coordinates (but with different values of “nid”), where one entry records a strike (possum) and the other does not. Does it make sense to aggregate these entries as one sampling event for that given trap, or does the “nid” indicate that there are multiple traps at the trap site and they should be treated separately?
Hi Sophie,
The trap.nid is the unique id of the parent trap the record belongs to, while the nid column is the unique id of the record itself. We really should be calling this column the record nid!
As you suspect, multiple entries recorded against the same trap.nid at the same date and time suggest that there were multiple traps stored at this one trap site. But it doesn’t guarantee it: another, less likely, scenario is that someone double entered a trap record.
A good guide here would be to check the code column. If a record has the same trap.nid and date/time as another record but a different code, then it is from a secondary trap at the same location.
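The check described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the column names "date" and "strike", and all the sample rows, are made up for the example and may not match the actual export; only "trap.nid", "nid" and "code" come from this thread. Groups that share a trap.nid and date/time are labelled by whether their codes differ.

```python
from collections import defaultdict

# Hypothetical sample rows; "date" and "strike" are assumed column names.
records = [
    {"nid": 101, "trap.nid": 1, "date": "2021-03-01 09:00", "code": "A1", "strike": "Possum"},
    {"nid": 102, "trap.nid": 1, "date": "2021-03-01 09:00", "code": "A1", "strike": "None"},
    {"nid": 103, "trap.nid": 2, "date": "2021-03-01 09:30", "code": "B1", "strike": "None"},
    {"nid": 104, "trap.nid": 2, "date": "2021-03-01 09:30", "code": "B2", "strike": "Rat"},
    {"nid": 105, "trap.nid": 3, "date": "2021-03-01 10:00", "code": "C1", "strike": "None"},
]

def classify_groups(rows):
    """Label each (trap.nid, date) group that holds more than one record."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["trap.nid"], row["date"])].append(row)

    labels = {}
    for key, members in groups.items():
        if len(members) < 2:
            continue  # a single record needs no checking
        codes = {m["code"] for m in members}
        if len(codes) < len(members):
            # At least two records share a code: could be a double entry.
            labels[key] = "possible double entry"
        else:
            # Every record has its own code: consistent with secondary traps.
            labels[key] = "possible secondary trap"
    return labels

for key, label in classify_groups(records).items():
    print(key, label)
```

The labels are deliberately worded as "possible": matching codes only hint at a double entry and do not prove it.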
Thanks very much for your quick and very helpful reply!
Just to double check I understand correctly: if multiple entries differ in nid but match on code (and trap.nid and date/time), they are a double entry rather than a secondary trap? This is true for the 158 sets of multiple entries in the dataset I’m using. Of these, in 1 case both entries record a possum strike; in all other cases one entry records a strike and the other does not.
How about if two entries match on code, trap.nid and date/time, but differ in status (e.g. one entry is “Sprung” and the other “Still set, bait bad”)?
How would you suggest I proceed with cleaning (and maybe aggregating) these data?
Yup, this would normally be the case. 158 sets, though, is a surprisingly high number for double entries, especially since not all the statuses match.
I can also confirm that none of the four projects you are part of appear to use supplementary traps (what Trap.nz calls secondary traps). So I think there is definitely something going on here.
My advice here is to check with your team members on the ground recording these. They’ll know for sure if these are double entries, and from there which records you can ignore, delete or aggregate.
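If it helps when going back to the team, a review list of the suspect groups could be built along these lines. This is a minimal sketch, not a cleaning rule: it assumes the export has "date" and "status" columns with those exact names, the sample rows are invented, and the status "Still set, bait OK" is a hypothetical value. It only lists records that match on trap.nid, date/time and code, and flags whether their statuses disagree; it does not delete or merge anything.

```python
from collections import defaultdict

# Hypothetical sample rows; "date" and "status" are assumed column names.
records = [
    {"nid": 201, "trap.nid": 7, "date": "2021-04-02 08:15", "code": "T7", "status": "Sprung"},
    {"nid": 202, "trap.nid": 7, "date": "2021-04-02 08:15", "code": "T7", "status": "Still set, bait bad"},
    {"nid": 203, "trap.nid": 8, "date": "2021-04-02 08:40", "code": "T8", "status": "Sprung"},
    {"nid": 204, "trap.nid": 8, "date": "2021-04-02 08:40", "code": "T8", "status": "Sprung"},
    {"nid": 205, "trap.nid": 9, "date": "2021-04-02 09:05", "code": "T9", "status": "Still set, bait OK"},
]

def review_list(rows):
    """List groups matching on trap.nid, date and code, for manual review."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["trap.nid"], row["date"], row["code"])].append(row)

    out = []
    for (trap_nid, date, code), members in groups.items():
        if len(members) < 2:
            continue  # only groups with several records are of interest
        statuses = sorted({m["status"] for m in members})
        out.append({
            "trap.nid": trap_nid,
            "date": date,
            "code": code,
            "record_nids": [m["nid"] for m in members],
            "statuses": statuses,
            # True when the records disagree, so a person should decide.
            "statuses_conflict": len(statuses) > 1,
        })
    return out

for item in review_list(records):
    print(item)
```

Groups where the statuses agree are the more plausible double entries; the conflicting ones are the cases worth asking the people in the field about first.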