Not in CoL Identification Portal as of Feb Protozoa Orbulina universa 55, 21

It is therefore reasonable to extrapolate that a large portion of the gaps identified in Table 4 will in the future be resolved with newer versions of the taxonomical authorities used to build the GBIF taxonomic backbone.

The rate of resolved names should, in principle, correlate directly with the growth in volume of the authoritative taxonomic references used by GBIF. Table 6. For example, correcting the wrong assignment of 90 species from the Kingdom Plantae to Animalia will affect more than 1.3 million occurrences within the GBIF Index as of February. On the other hand, correcting the wrong assignment of 26 species to Animalia will affect only 1, occurrences. Similar breakdowns are provided for Phylum Table 6. This table shows that the effort in correcting misidentifications at a high taxonomical rank (e.g. Kingdom) will impact a limited number of occurrences 1,, representing less than 0. Costs in verifying such misidentifications should be taken into consideration during data cleansing activities. Such analysis should in particular assess the amount of misidentifications e.

GBIF should also improve its reporting services to the original publishers so that potential taxonomical misidentifications are made known to them. In addition, GBIF should provide means to assess the effectiveness of the taxonomical name resolution services used during the harvesting and indexing processes. All taxa misidentifications should be documented and calls to expert groups e.

Geospatial

During the harvesting and indexing routines, these geo-referenced occurrences are checked in particular for wrong assignments e. In the context of this study, we considered a geo-referenced occurrence to be a record in the GBIF Index with latitude and longitude within the earth bounding box i.

This amounted to This includes a substantial number of records being reported as 0. This would greatly facilitate the validation of geo-referenced occurrences during the harvesting and indexing routines. By end of , This percentage was lower As shown in Figure 4, the rate of geo-referenced records is increasing over time. For older occurrences, the rate of georeferencing decreases substantially.
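The geo-referencing criterion described above can be illustrated with a short check. This is a minimal sketch, not the routine used by GBIF: the function names and category labels are hypothetical, the bounding box is assumed to be the conventional latitude range of -90 to 90 and longitude range of -180 to 180, and the reading that suspicious records are those at exactly 0, 0 is also an assumption.

```python
from typing import Optional


def classify_coordinates(lat: Optional[float], lon: Optional[float]) -> str:
    """Classify a coordinate pair as 'missing', 'out_of_bounds', 'zero_zero' or 'ok'.

    Assumption: the earth bounding box is -90 <= lat <= 90 and -180 <= lon <= 180.
    """
    if lat is None or lon is None:
        return "missing"
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return "out_of_bounds"
    # Assumption: a record at exactly (0, 0) is inside the box but suspicious,
    # so it is counted as geo-referenced and labelled separately for review.
    if lat == 0.0 and lon == 0.0:
        return "zero_zero"
    return "ok"


def is_georeferenced(lat: Optional[float], lon: Optional[float]) -> bool:
    """A record counts as geo-referenced when its coordinates fall inside the box."""
    return classify_coordinates(lat, lon) in ("ok", "zero_zero")
```

Keeping the 0, 0 case as a separate label rather than rejecting it mirrors the text: such records fall inside the bounding box and are therefore counted, but they can still be reported back to the publisher.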

Figure 4. Evolution of the percentage of geo-referenced records

During the harvesting and indexing procedures, a series of verifications on geospatial fields e. In February , we estimated that less than 3. Because GBIFS is not mandated to apply corrections to the original published occurrence records, records with possibly wrong coordinates are only flagged during the harvesting and indexing routines.

While this partly addresses the problem, it is important to note that the verification and correction of the original occurrence records lies with the publishers. The data curators can therefore enrich their database content and increase the quality and accuracy of the content mobilised through GBIF, thus making it suitable for wider uses. The high percentage of georeferenced records within the GBIF Index, as well as the improvements observed between our two assessments, are an important quality stamp of the GBIF mobilised data.

Indicators

Setting an ideal target for the rate of georeferenced occurrences within the GBIF index is a difficult task. Some voucher specimen collections have in general a lower percentage of georeferenced records compared to recent field observation records. The publisher community is therefore addressing this challenge in particular for recent records.

GBIF's role is to enable the discovery of published records, not to modify the original published data. While this can be seen as a limitation, one way forward would be to set targets by periods such as , , and today.

Experts could investigate datasets falling under these baselines. Expert curation and reports with recommendations on possible corrections should be sent to the original publishers to actively promote constant data quality improvements.

Temporal

As detailed in Table 7, The breakdown provided in this analysis shows that 4. However, the comparison between raw data and processed data uncovered some issues in date processing, such as mismatches between the published and interpreted date stamp.

For example, 8. In addition, 5. More details about this mismatch can be found in Otegui et al. In , these preliminary findings were taken into account by the GBIFS, and the existing processes for interpreting date stamps at the publisher level were reviewed and improved. Table 7 also shows the comparison between the assessments made in December and February. Most of these improvements are due to improved interpretation of malformed date stamp information in the published resources during the harvesting and indexing routines.
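A mismatch between the published and the interpreted date stamp, as discussed above, could be detected with a comparison along the following lines. This is a minimal sketch and not the GBIFS interpretation routine: the function names, the category labels and the rule of reading the published year as the first standalone four-digit number in the raw string are all assumptions made for illustration.

```python
import re
from typing import Optional

# Assumption: the published year is the first run of exactly four digits.
YEAR_PATTERN = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def raw_year(raw_date: Optional[str]) -> Optional[int]:
    """Extract a four-digit year from the date string as supplied by the publisher."""
    if not raw_date:
        return None
    match = YEAR_PATTERN.search(raw_date)
    return int(match.group(1)) if match else None


def compare_years(raw_date: Optional[str], interpreted_year: Optional[int]) -> str:
    """Compare the published (RAW) year with the interpreted (OCC) year."""
    published = raw_year(raw_date)
    if published is None and interpreted_year is None:
        return "no_year"
    if published is None:
        return "interpreted_only"
    if interpreted_year is None:
        return "lost_in_processing"
    return "match" if published == interpreted_year else "mismatch"
```

Counting the records that fall into each category for the raw and the processed data would give a breakdown comparable in spirit to the one reported in Table 7.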

December February Difference
Occurrences with no year provided 82,, 42,,
RAW refers to records as supplied by the publisher, whereas OCC indicates records available through the portal after processing. See text.

The year information is the most important element within temporal date stamp information.

However, the month and day elements provide additional accuracy, in particular when looking at migratory species that move, for example, from feeding to reproduction areas within the same year.

Partial dates, as found on many older specimens, may be useful for one or the other of these purposes even if they cannot serve all needs. Such gaps in the temporal attributes are a limitation for certain types of analysis, such as population cycles or changes in migration patterns related to climate change. However, alone the low percentage of occurrence records without temporal information However, combined with other parameters like geo-referencing, it could become a serious limitation for scientists, in particular when dealing with analyses requiring the combination of these e. This total represents , This also indicates that

Breakdown of the temporal and geospatial data availability
With year Without year Total

This represents Although this is an improved figure compared to the In , GBIFS greatly improved its harvesting and indexing processes in order to interpret the information of publishers as accurately as possible.

In February , the taxonomical backbone was greatly improved and the indexing processes fine-tuned. This has led to a lower percentage While these data quality trends are promising (Figure 5), they are mostly due to technical improvements in the GBIF IT infrastructure, and much more effort is required at the level of the data publishers within the GBIF community.

Collection curators should be encouraged to explore ways to improve the quality of the published information, in particular along three dimensions: taxonomical, temporal and geospatial. Many tools are aimed at helping curators to identify possible errors and to standardise data in accordance with authoritative references. Such a situation could happen, for example, when the same dataset is published more than once through GBIF. Comparing datasets on criteria like taxonomy, temporal and geospatial information can easily identify these cases. To assess these cases, we assumed that a duplicate record would be identified when the values respectively for taxonomical species id, temporal timestamp date e.

Based on this assumption, we calculated in February the total number of potential duplicates between resources. The results are summarized in Table We identified 42 combinations of datasets with at least , potential duplicate occurrences, representing a total of more than 30 million occurrences. This represents more than 9.
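The duplicate criterion used for this calculation can be sketched as a grouping of records on a composite key. This is an illustrative sketch only: the field names (`dataset`, `species_id`, `date`, `lat`, `lon`) and the function name are hypothetical, and exact equality on all four values is an assumption, since the precise matching rule is not fully stated here.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Mapping, Tuple


def potential_duplicates(records: Iterable[Mapping]) -> Dict[Tuple[str, str], int]:
    """Count, per pair of datasets, the record keys that both datasets share.

    Assumption: two records are potential duplicates when species id, date,
    latitude and longitude are all present and exactly equal.
    """
    datasets_by_key = defaultdict(set)
    for rec in records:
        key = (rec["species_id"], rec["date"], rec["lat"], rec["lon"])
        # Records with a missing value cannot be matched reliably; skip them.
        if any(value is None for value in key):
            continue
        datasets_by_key[key].add(rec["dataset"])

    pair_counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for datasets in datasets_by_key.values():
        # Every pair of datasets sharing this key gains one potential duplicate.
        for pair in combinations(sorted(datasets), 2):
            pair_counts[pair] += 1
    return dict(pair_counts)
```

Sorting the resulting pairs by count would yield a ranking of dataset combinations similar in form to a "top combinations" table. Exact matching on coordinates is deliberately strict; a tolerance on latitude and longitude would find more candidates at the cost of more false positives.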

The top 20 combinations of potential duplicates are summarized in Table 11. In all cases e. What appears very surprising is that most of these potentially duplicated resources were registered with very similar names e. When a new resource is registered, a simple text comparison between the title of the new resource and those of the existing published ones would have enabled rapid identification of obvious possible duplication. This has never been implemented up to now in GBIF, but efforts are underway to automate this process as well as to resolve the already identified potential duplicates in close communication with the respective GBIF publishers.
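The simple title comparison suggested above could look like the following. This is a hedged sketch rather than a description of any GBIF registration check: the function names and the 0.9 threshold are hypothetical choices, and the similarity measure is the ratio from `difflib.SequenceMatcher` in the Python standard library, used here as one possible text comparison.

```python
from difflib import SequenceMatcher
from typing import Iterable, List, Tuple


def normalise(title: str) -> str:
    """Lower-case the title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def similar_titles(
    new_title: str, existing_titles: Iterable[str], threshold: float = 0.9
) -> List[Tuple[str, float]]:
    """Return existing titles whose similarity to the new title meets the threshold.

    Assumption: 0.9 is an arbitrary cut-off chosen for illustration only.
    """
    new_norm = normalise(new_title)
    hits = []
    for title in existing_titles:
        ratio = SequenceMatcher(None, new_norm, normalise(title)).ratio()
        if ratio >= threshold:
            hits.append((title, ratio))
    # Most similar titles first, so obvious duplicates surface at the top.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

A hit from such a check would only prompt a manual review; similar titles do not prove that two resources contain the same records.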

Table DOI , with proper metadata, would have been a much more robust solution. A similar assessment in February Figure 6.

Between these two assessments, the GBIF taxonomical backbone was reviewed with the latest version of the Catalogue of Life.

Data records by Kingdom (a, left: Dec ; b, right: Feb )
Figure 7. Data records by Phylum (a, left: Dec ; b, right: Feb )
Figure 8. Data records by Class (a, left: Dec ; b, right: Feb )

A breakdown at the Class rank Figure 8.