Introduction 

Oilseeds are the seeds of many different types of plants that are grown as a source of vegetable oil. They contain high concentrations of energy as well as moderate amounts of protein and fiber. Soybeans are the most widely used oilseed, accounting for over half of all oilseeds produced worldwide, and the term “vegetable oil” is synonymous with soybean oil in the United States. Other popular oilseeds are groundnuts/peanuts, sunflower, sesame, and rapeseed/canola. The meal from oilseeds is used as livestock feed, and biofuel from oilseed oil is becoming more prevalent as awareness of environmental concerns and sustainable alternative energies increases. Production and trade of oilseeds have increased in recent years, with many types reaching ten-year production records in 2020, aided by higher demand and improved crop yield per acre. Different oilseeds vary in oil yield and production per crop acre, although the process for manufacturing oil, biofuel, and meal from oilseeds is similar for most types. Final product quality depends heavily on processing, and many nutritional parameters must be monitored in oilseeds, such as oil, moisture, protein, carbohydrates, and fatty acids. Other important quality components in oilseeds include flavor and aroma, seed viability, oil yield, and seed variation. Seed variation can arise from a number of factors, including aging, geographical origin, and color. Standard methods for measuring quality parameters of interest in oilseeds are often expensive and time-consuming, require the use of toxic chemicals and solvents, and destroy the tested sample. As oilseeds continue to play a large role in the world food, animal feed, and biofuels markets, there is a need to develop fast, non-invasive testing methods to meet the evolving challenges in producing quality oilseeds. One such method that has been examined is NIR spectroscopy.

Analytes 

  • Oil content 
  • Moisture 
  • Protein 
  • Fatty acids 
  • Lipids 
  • Ash 
  • Carbohydrates 
  • Oleic acid 
  • Linoleic acid 
  • Erucic acid 
  • Amino acids 
  • Pearson correlation coefficient analysis between nutritional components determined by NIR spectroscopy and volatile compounds determined by GC-MS 
  • Oil yield of sesame seeds used for Traditional Aqueous Extraction Processing (TAEP) 
  • Classification of viable and non-viable castor seeds 
  • Geographical origin and age of Torreya grandis seeds 
  • Determining sesame seed variation based on origin and light/dark seeds 
  • Variation in major components of sesame seeds in different countries 

Summary of Published Papers, Articles, and Reference Materials 

Review of NIR Spectroscopy Methods for Nondestructive Quality Analysis of Oilseeds and Edible Oils 

Oilseeds and edible oils play a vital role in human health and are both consumed and produced on a significant scale worldwide. They are an important energy source for people and provide many important nutritional components, such as starch, protein, fatty acids, amino acids, vitamins, phytosterols, and polyphenols. The quality of edible oils depends on the quality and proper processing of oilseeds, and in recent years there has been a significant increase in interest in the quality and safety of oilseeds. With this increased interest, there is a need for fast, non-invasive, and cost-effective testing methods. NIR spectroscopy has been used as a quality assessment tool for numerous parameters in both oilseeds and edible oils, and this review examines the different parameters, methods, and applications.

Oilseeds that have been used in NIR spectroscopic applications include soybeans, rapeseed, sesame, and peanuts.  Traditional testing methods for quality control are often expensive and time-consuming.  Some of these include Soxhlet extraction for oil content, Kjeldahl method for protein, titration for acid value, and GC or GC-MS for fatty acid composition.  Building a NIR method requires the scanning of samples with known reference values for the parameters of interest. Chemometric modeling is then used to correlate the NIR spectra to the parameters of interest.  Once models are created, they can use the NIR spectra to predict the modeled parameters. A distinct advantage of NIR spectroscopy is that a single spectrum scan can be used to predict multiple components as long as the chemometric models are created.  Listed below are some oilseed types and parameters that have used NIR spectroscopy for predicted values in various studies. 

Oilseed                 Parameter        Reported Statistic
Rapeseed                Oil Content      Variance – 0.027
Rapeseed                Linoleic Acid    No statistics given
                        Erucic Acid      No statistics given
Rapeseed/Mustard Seed   Oil Content      R² – 0.94
                        Moisture         R² – 0.87
                        Protein          R² – 0.91
Sesame                  Protein          SEP – 0.827%
Soybean                 Crude Protein    No statistics given
                        Moisture         No statistics given
                        Lipid            No statistics given
                        Ash              No statistics given
                        Carbohydrates    RMSECV – 0.4% to 2.3% for all parameters
Soybean                 Oleic Acid       R² > 0.91
Soybean                 Amino Acids      R² – 0.83-0.90
Peanut                  Oil Content      No statistics given
                        Fatty Acids      Residual % Deviation > 5
Peanut                  Protein          R² – 0.99

NIR spectroscopy has also been studied for more specific nutritional components, with mixed results, because the concentration of compounds like tocopherols is often too low for NIR to be a suitable method. However, NIR spectroscopy can be successful in creating prediction profiles and estimated values for low concentration components if parameters like protein and fatty acids can be shown to correlate with the low concentration compounds. One successful study correlated glucosinolates, the components found in pungent plants like mustard and cabbage, to NIR spectra of rapeseed/mustard seeds with an R² of 0.983. Tracing the geographical origin of oilseeds is important because quality can vary greatly between oilseeds that come from different areas. Visual sorting can be difficult and impractical, and because nutritional content varies with origin, NIR spectroscopy has been successful as a tool for determining geographical origin and detecting adulteration in oilseeds. As oilseed production and consumption continue to grow worldwide, NIR spectroscopy will continue to emerge as a fast, non-invasive, and cost-effective method for quality control of oilseeds as new applications are developed for its use.

Review of NIR spectroscopy methods for nondestructive quality analysis of oilseeds and edible oils | Semantic Scholar 

Analysis of Peanut Using Near-Infrared Spectroscopy and Gas Chromatography–Mass Spectrometry: Correlation of Chemical Components and Volatile Compounds 

Peanuts contain protein, oil, oleic acid, and linoleic acid, and their flavor is largely determined by pyrazine and aldehyde compounds. Both the nutritional value and flavor components are important quality control standards. While the nutritional components are proven parameters that can be measured using NIR spectroscopy, flavor components have a concentration below the threshold for NIR correlation. However, correlating detectable compounds to lower concentration compounds is a method for creating profiles that can be used to estimate the lower concentration components. In this study, NIR spectroscopy was examined for determining nutritional components in peanuts, and GC-MS was used to isolate and determine volatile compounds with the purpose of identifying the major aroma components so that the nutritional components can be correlated with flavor components. This correlation can then enable the use of NIR spectroscopy to predict flavor components. Twelve different peanut cultivars from China were procured for the study. Peanuts were grown, dried in the sun until the moisture content was less than 10%, and then stored at 4°C. Shelled peanuts were scanned using an FT-NIR spectrometer from 12000 cm-1 to 4000 cm-1 using about half a cup of sample in a spinning cup of 5 cm diameter. Calibration models were developed using values from reference tests for the parameters of interest. The range of values for the nutritional compounds in the peanut samples is shown below:

Oleic Acid      35.69 g to 82.79 g/100 g oil
Linoleic Acid   2.92 g to 44.19 g/100 g oil
Protein         26.97 g to 33.07 g/100 g raw materials
Oil             45.53 g to 55.53 g/100 g raw materials

Volatile compounds were extracted using headspace solid-phase micro-extraction. Samples were analyzed in triplicate and were desorbed at the GC-MS injection port for analysis. Identification and compound concentration were determined from a NIST mass spectra library search, the peak area of identified compounds, and an internal standard. In total, fourteen flavor compounds were identified: six pyrazines, four aldehydes, two methyl pyrroles, maltphenol, 2,3-dihydro coumarone, and 4-vinyl-2-methoxy phenol. The pyrazines and aldehydes are the main flavor compounds in roasted peanuts, and 2,5-dimethyl pyrazine has been the compound most correlated with aroma in previous studies. A Pearson correlation coefficient analysis was performed to determine the relationships between the four nutritional components and the fourteen volatile compounds. The results showed close correlation between various compounds. Pyrazine compounds showed a strong correlation with oleic acid. Aldehydes were positively, though not perfectly, correlated with six of the pyrazine compounds. While more work and more peanut varieties would be required before using a model like this in a practical setting, the results show the potential to analyze and develop the relationships between nutrients and flavor in peanuts. The use of NIR spectroscopy in such a quality control model would enable manufacturers to develop simple tests that can predict the flavor of roasted peanuts based on the composition of raw peanuts. Such a model would be hugely beneficial, as NIR spectroscopy is far cheaper, faster, and easier to use than GC-MS and similar methods.
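
The Pearson correlation analysis described above can be illustrated with a minimal sketch. The function name `pearson_r` and the five data points are hypothetical and are not taken from the study; they only show how a correlation coefficient between a nutritional component and a volatile compound would be computed.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    covariance_sum = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance_sum / sqrt(var_x * var_y)


# Hypothetical values for five cultivars (NOT data from the study).
oleic_acid = [36.0, 45.0, 58.0, 70.0, 82.0]   # g/100 g oil
pyrazine_area = [1.1, 1.6, 2.0, 2.9, 3.3]     # relative GC-MS peak area

r = pearson_r(oleic_acid, pyrazine_area)
print(f"r = {r:.3f}")
```

A value of r close to +1 or -1 indicates a strong linear relationship; with only a handful of cultivars, as in this illustration, such a value should be read as a screening hint rather than as a validated model.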

Analysis of Peanut Using Near-Infrared Spectroscopy and Gas Chromatography–Mass Spectrometry: Correl (tandfonline.com) 

Rapid Determination of the Oil and Moisture Contents in Camellia gauchowensis Chang and Camellia semiserrata Chi Seeds Kernels by Near-infrared Reflectance Spectroscopy 

Camellia is a plant native to China and one of the most important sources of high quality edible plant oil. It is widely grown, with more than twelve million acres in production, and annual production exceeds one hundred fifty million kilograms. Camellia oil is an essential cooking oil of high nutritional value, with unsaturated fatty acids ranging from 85% to 92% and a variety of other healthy components, such as Vitamin E, phytosterols, squalene, and flavonoids. It is essential to analyze quality characteristics like oil and moisture at harvest and during processing. Traditional methods for analyzing these parameters are expensive, time-consuming, and often require the use of toxic and volatile chemicals. NIR spectroscopy was examined for the purpose of determining oil and moisture in Camellia seeds. One hundred ten samples of each of two separate varieties of Camellia seeds were procured for the study: Camellia gauchowensis Chang and Camellia semiserrata Chi. Each individual sample weighed approximately two hundred grams. In order to ensure that the models accounted for variability in samples and encompassed a large range for the parameters of interest, samples were selected from five separate growing regions. Seeds were planted, collected upon ripening, dried, dehulled, and stored with proper ventilation and humidity. Reference values were determined for oil using Soxhlet extraction and for moisture using oven drying. NIR spectra were collected from 950 nm to 1650 nm at a 5 nm scan interval. Each sample was scanned in triplicate and the three spectra were averaged into a single spectrum. Various preprocessing methods were applied to the spectral data before chemometric modeling. Principal Component Analysis (PCA) was performed for outlier analysis and to check variability in the spectra. Partial Least Squares (PLS) calibration models were created for oil and moisture for the two separate varieties of seeds. Results are shown below.

Camellia gauchowensis Chang 

Oil        R² = 0.98
Moisture   R² = 0.92

Camellia semiserrata Chi 

Oil        R² = 0.95
Moisture   R² = 0.89

The results of this study demonstrated the feasibility of using NIR spectra and chemometric models to determine oil and moisture content in two separate varieties of Camellia seeds. Correlation coefficients were high, and considering the relatively small sample set and the different varieties used, the models showed the potential for developing NIR spectroscopy as a large-scale quality control method for Camellia seeds. Large-scale testing using traditional methods is impractical, and NIR spectroscopy could be used as a tool to facilitate quality control as well as improve the economics of Camellia seed trading.
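
The scan settings in this study (950 nm to 1650 nm at a 5 nm interval, three scans averaged per sample) can be sketched as follows. The function names and the absorbance values are hypothetical and chosen only for illustration; the study does not describe its software.

```python
def wavelength_axis(start_nm, stop_nm, step_nm):
    """Return the list of wavelengths from start to stop (inclusive)."""
    n_points = int(round((stop_nm - start_nm) / step_nm)) + 1
    return [start_nm + i * step_nm for i in range(n_points)]


def average_replicates(replicates):
    """Average several replicate spectra point by point into one spectrum."""
    if not replicates:
        raise ValueError("at least one spectrum is required")
    length = len(replicates[0])
    if any(len(spectrum) != length for spectrum in replicates):
        raise ValueError("all replicate spectra must have the same length")
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]


axis = wavelength_axis(950, 1650, 5)
print(len(axis))  # number of data points per spectrum

# Three hypothetical replicate scans, truncated to two points for brevity.
scans = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
print(average_replicates(scans))
```

Averaging replicate scans reduces random noise and sampling variation before preprocessing and modeling, which is the reason this step appears in most of the studies summarized here.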

Rapid Determination of the Oil and Moisture Contents in Camellia gauchowensis Chang and Camellia semiserrata Chi Seeds Kernels by Near-infrared Reflectance Spectroscopy – PubMed (nih.gov) 

Nondestructive On-Site Detection of Soybean Contents Based on An Electrothermal MEMS Fourier Transform Spectrometer 

Soybeans are a significant source of plant oils and proteins. They are widely used in both food and industry. Fat, protein, and moisture are very important quality control parameters in soybeans. Protein is especially important because the price of soybeans in international trade markets is often determined by protein content. While traditional methods for determining these parameters are accurate, they are impractical to implement on a large scale. There is a need for a quick and accurate method for determining quality parameters in soybeans that can be used for on-site analysis. In this study, a portable MEMS-FT spectrometer was used to assess the feasibility of determining fat, protein, and moisture in soybeans. Three hundred and fifty-eight soybean samples from different growing areas were procured for the study. Each sample weighed approximately two hundred grams. Samples were scanned from 1000 nm to 2500 nm. A portion of each sample was loaded into the sampling cup and scanned three separate times. The three spectra per sample were then averaged into a single spectrum. Traditional reference tests were performed to determine reference values for fat, protein, and moisture. Various preprocessing methods were applied to the spectral data before chemometric modeling. 80% of the samples were chosen as a calibration set for modeling. Partial Least Squares (PLS) models were created correlating the NIR spectra to fat, protein, and moisture. The remaining 20% of the samples were used as a validation set, and independent predictions were performed using the spectra of these samples and the calibration models.

Parameter   Range            R²     SEP
Moisture    5.1% to 11.9%    0.92   0.31%
Protein     32.8% to 49.3%   0.92   0.56%
Fat         15.1% to 22.3%   0.85   0.69%

Correlation is good for all three calibration models and the independent sample set predictions proved the validity of the models.  All predictions were within 3% error of the reference test values for all parameters.  Other statistical analysis and values offer further proof that the models are valid, such as high values for RPD.  This study showed that a portable MEMS-FT spectrometer can be used for accurate on-site testing of fat, protein, and moisture in soybeans in a fast, non-invasive manner that can be implemented on a large scale.   
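
The validation statistics quoted for this study (R², SEP, RPD) can be computed from reference and predicted values as in the sketch below. It uses definitions that are common in chemometrics: SEP as the bias-corrected standard deviation of the prediction errors, and RPD as the standard deviation of the reference values divided by SEP. The study's exact formulas are not given in this summary, so these definitions are an assumption, and the protein values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def validation_stats(reference, predicted):
    """Common validation-set statistics for a calibration model."""
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    errors = [p - r for p, r in zip(predicted, reference)]
    n = len(errors)
    bias = mean(errors)
    # SEP: standard deviation of the errors after removing the bias.
    sep = sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    # RMSEP: root mean square error, which includes the bias.
    rmsep = sqrt(sum(e * e for e in errors) / n)
    ref_mean = mean(reference)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((r - ref_mean) ** 2 for r in reference)
    r2 = 1.0 - ss_res / ss_tot
    # RPD: spread of the reference values relative to the prediction error.
    rpd = stdev(reference) / sep if sep > 0 else float("inf")
    return {"bias": bias, "sep": sep, "rmsep": rmsep, "r2": r2, "rpd": rpd}


# Hypothetical protein values in % (NOT data from the study).
reference = [33.0, 36.5, 40.0, 43.5, 47.0]
predicted = [33.4, 36.1, 40.3, 43.2, 47.5]

stats = validation_stats(reference, predicted)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Reporting SEP together with the range of the reference values, as the table above does, makes it easier to judge whether an error is small relative to the spread of the samples; RPD expresses the same idea as a single ratio.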

Nondestructive On-Site Detection of Soybean Contents Based on An Electrothermal MEMS Fourier Transform Spectrometer | IEEE Journals & Magazine | IEEE Xplore 

Nondestructive Near-Infrared Reflectance Spectroscopic Analyses of the Major Constituents of Sesame (Sesamum indicum L.) Whole Seeds with Different Coat Color

Sesame is an important oilseed plant and has gained considerable attention in Japan as an alternative crop to rice. A need exists for rapid testing of quality parameters in sesame seeds for breeding selection. NIR spectroscopy was examined as a method for determining oil, protein, and moisture in sesame seeds, and seeds of different colors were specifically chosen to determine whether color affected NIR spectroscopic analysis. Thirty different kinds of samples were collected by the Seed Bank Project of Japan International Cooperation Agency, and fifty-two other samples of different varieties and breeding lines were obtained from the National Institute of Crop Science. These samples included yellowish-brown, dark brown, black, and white coated seeds. NIR spectra were collected from 1100 nm to 2500 nm at 2 nm intervals. Each sample was scanned in three separate positions and the three spectra were averaged into a single spectrum per sample. Traditional methods were used to determine reference values for oil, protein, and moisture. Various preprocessing methods were applied to the spectral data before chemometric modeling. Visual examination of the NIR spectra showed a marked difference between the spectra of the black seeds and the other three groups. Partial Least Squares (PLS) calibration models were created correlating the NIR spectra to moisture, oil, and protein in the sesame seed samples.

Parameter   R²      SEP
Moisture    0.979   0.318%
Oil         0.931   1.234%
Protein     0.939   0.830%

The results of this study demonstrated the feasibility of measuring moisture, oil, and protein in sesame seeds from NIR spectra and calibration models. Cross-validation was used for independent predictions, and prediction results showed good correlation between the NIR method and reference values. Most importantly, the different colors of the sesame seeds did not appear to have an effect on modeling results. Results shown in this study are similar to those of other studies that used NIR spectroscopy to measure these parameters. NIR spectroscopy can be used to determine major constituents in sesame seeds, enabling quick and non-invasive analysis of the seeds for breeding selection.

Nondestructive Near-Infrared Reflectance Spectroscopic Analyses of the Major Constituents of Sesame (Sesamum indicum L.) Whole Seeds with Different Coat Color: Plant Production Science: Vol 7, No 3 (tandfonline.com)

Near-Infrared Spectroscopy Combined with Multivariate Calibration to Predict the Yield of Sesame Oil Produced by Traditional Aqueous Extraction Process 

Sesame is one of the main oil crops in China and Asia. It is cultivated in tropical and subtropical regions on over 7.5 million hectares, and annual seed yield exceeds three million tons. China accounts for approximately a quarter of worldwide sesame production. Sesame is rich in many fatty acids, including oleic, linoleic, palmitic, and stearic acids. Its oil is popular for its health benefits, reported oxidative stability, and a unique and pleasant aroma and flavor, which make it recognized as a top-grade vegetable oil. Mechanical pressing and solvent extraction are used for large-scale production of sesame oil. These methods obtain a high oil yield, but they degrade quality because of heat treatment and also denature the proteins in the meal, which is used as animal feed. An alternative to these oil extraction methods is the Traditional Aqueous Extraction Process (TAEP), which grinds sesame seeds at low temperature and adds pure water to displace the oil from the sesame sauce. While this method improves oil quality, it does reduce yield. It is beneficial for sesame oil manufacturers to find high quality seeds when using TAEP, but no rapid and effective method that can be implemented on a large scale exists for this purpose. In this study, NIR spectroscopy was used to develop multivariate calibration models that can predict TAEP oil yield from NIR spectra. One hundred and forty-five sesame seed samples from nine different producing areas and multiple markets in China were procured for the study. All seeds were harvested in the same year and consisted of five different colors. Black seeds were excluded from the study because they have a higher price and are rarely used for oil extraction. Seeds were dried in the sun and stored under cool and dark conditions before scanning. All seeds were scanned using an FT-NIR spectrometer from 12000 cm-1 to 4000 cm-1 at 4 cm-1 resolution and at a 1.929 cm-1 interval.
Sixty-four scans were collected per reading and averaged into one spectrum.  This process was repeated three times for each sample and the three spectra were averaged into a single spectrum per sample.  After NIR spectra were collected, all seed samples were extracted in an oil mill using the TAEP method.  Oil yield was calculated by dividing the net weight of oil obtained by TAEP by the net weight of sesame seed.  Various preprocessing methods were applied to the spectral data before chemometric modeling, including an algorithm to separate the spectra into a training and test set.  Least-Squares Support Vector Machines (LS-SVMs) calibration models were created using the pre-processed spectra and different wavenumber ranges to correlate the NIR spectra to oil yield.  Results are shown below.   

Oil Yield   RMSEP = 1.15% w/w   (Smoothing, SNV, 2nd derivative, 9000 cm-1 to 4000 cm-1)

The results of this study were excellent and proved the feasibility of using NIR spectroscopy to determine oil yield from the TAEP process by using NIR spectra of sesame seeds and a calibration model.  The RMSEP shown here was the lowest after using many different preprocessing algorithms and wavenumber ranges.  The transformations used can help remove unwanted variations in the raw spectra as well as reduce the effects of baseline shifts, noise, and particle size differences.  This model provides a practical method for using NIR spectroscopy to predict oil yield in sesame seeds, enabling manufacturers to choose high quality seeds when using TAEP for sesame oil manufacturing. 
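
Two of the calculations mentioned in this study can be sketched briefly: the oil yield (net weight of oil divided by net weight of seed) and Standard Normal Variate (SNV) preprocessing, which centers and scales each spectrum by its own mean and standard deviation. The function names and numeric values are hypothetical, and the smoothing and second-derivative steps used in the study are not shown.

```python
from statistics import mean, stdev


def oil_yield_percent(oil_weight, seed_weight):
    """Oil yield as % w/w: weight of extracted oil over weight of seed."""
    if seed_weight <= 0:
        raise ValueError("seed weight must be positive")
    return 100.0 * oil_weight / seed_weight


def snv(spectrum):
    """Standard Normal Variate: (value - spectrum mean) / spectrum std."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 in the denominator)
    if s == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(value - m) / s for value in spectrum]


# Hypothetical absorbance values (NOT data from the study).
spectrum_a = [0.42, 0.45, 0.51, 0.60, 0.58, 0.49]
# The same spectrum with a multiplicative scatter effect and a baseline offset.
spectrum_b = [2.0 * value + 0.5 for value in spectrum_a]

print(snv(spectrum_a))
print(snv(spectrum_b))
print(oil_yield_percent(45.0, 100.0))
```

The two SNV outputs agree to within rounding even though the raw spectra differ in scale and offset, which illustrates how this transformation reduces baseline shifts and scatter differences between samples.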

Near-Infrared Spectroscopy Combined with Multivariate Calibration to Predict the Yield of Sesame Oil Produced by Traditional Aqueous Extraction Process (hindawi.com) 

Use of Near-Infrared Spectroscopy for Estimating Fatty Acid Composition in Intact Seeds of Rapeseed 

Rapeseed is one of the most important and widely produced oilseed crops, used as a source of vegetable oil and as a substitute for fossil diesel fuel. Protein and fatty acid profile are important quality parameters when rapeseeds are processed into edible oil for humans, meal for livestock and poultry feed, and industrial biodiesel. The oil in rapeseed is approximately 40% of the seed weight and is composed of numerous fatty acids, including saturated fatty acids like palmitic and stearic acids and unsaturated fatty acids like oleic, linoleic, linolenic, eicosenoic, and erucic acids. Research is being conducted to develop new genotypes for quality breeding of rapeseeds, and such research requires determining the fatty acid composition in a large number of breeding lines. The traditional method for determining fatty acid profiles is chromatography, which is time-consuming, expensive, and requires sample destruction. There is a need for a fast, non-invasive method for determining fatty acid profiles in rapeseed, and NIR spectroscopy was examined for this purpose. Three hundred forty-nine samples of rapeseed germplasm were procured for the study. Each sample contained about two grams of seeds. All samples were scanned using an NIR spectrometer from 400 nm to 2500 nm at 2 nm intervals. Reference testing to determine fatty acid profiles for the samples was conducted using Gas/Liquid Chromatography, a process that took about twenty minutes for each sample and expressed individual fatty acids as a percentage of the total fatty acids. Various preprocessing methods were applied to the spectral data before chemometric analysis. The samples were split into two sets: two hundred and forty-nine samples for a calibration set and one hundred samples for a validation set. Shown below are the ranges of values expressed as a percentage of total fatty acids for the samples, the mean values, and modeling statistics.

Fatty Acid   Range (%)      Mean (%)   R²      SEC
Palmitic     2.0 to 7.4     4.01       0.795   0.355
Stearic      0.8 to 3.7     1.61       0.850   0.193
Oleic        6.2 to 70.4    32.9       0.980   2.679
Linoleic     5.7 to 26.3    15.9       0.908   1.005
Linolenic    1.8 to 13.6    7.70       0.848   0.626
Eicosenoic   0.6 to 29.6    9.00       0.509   3.701
Erucic       0.0 to 62.9    28.8       0.983   2.606

Results were excellent for oleic, linoleic, and erucic acids, and independent predictions using the validation set confirmed the validity of these models. The results were worse for the other four fatty acids, especially eicosenoic acid. This likely occurred because the mean percentages of these fatty acids are below 10%, which places their concentrations below the level of detection for NIR analysis. However, more calibration work and a larger sample set may improve the results enough to create models that can be used for screening purposes. It must be noted that seventy of the samples contained 0% erucic acid, and a more evenly distributed sample set should be used for a calibration model before this application is used in a practical setting. Erucic acid concentration is very important because studies have shown that high concentrations of it can cause heart disease in rats, and the use of products containing it is strictly regulated. Mustard oil is banned from sales and imports in many countries because of a high erucic acid concentration, and rapeseed oil must have an erucic acid concentration below 2%. The study demonstrated the potential to use NIR spectroscopy as a fast and non-invasive method for determining oleic, linoleic, and erucic acid in rapeseed.

Use of Near-Infrared Spectroscopy for Estimating Fatty Acid Composition in Intact Seeds of Rapeseed | Semantic Scholar

Characterisation of Castor (Ricinus communis L.) Seed Quality Using Fourier Transform Near-Infrared Spectroscopy in Combination with Multivariate Data Analysis 

Castor is a non-edible oil seed crop that is used for biodiesel production.  The oil in castor seeds accounts for 42% to 58% of the total weight and the oil is more than 90% ricinoleic acid, an acid that enables it to dissolve in alcohols at a low temperature.  This property makes castor oil advantageous over other vegetable oils for biodiesel production because less energy is required for transesterification to reduce oil viscosity.  Castor is also known for variation in maturity stages at harvest because of differences between racemes.  The final harvest can consist of seeds of different size, weight, and physiological maturity, leading to differences in quality.  Heavier seeds contain more oil and the seed weight can be positively correlated with the germination ability of the seeds.  NIR spectroscopy was examined as a method for characterizing castor seeds based on viability and oil content.  Two sample sets were procured for the study: Three hundred castor seeds from two separate ecotypes that were grown under both controlled and water-stressed conditions and twelve hundred seeds for the two ecotypes that were individually weighed and classified by weight.  The three seed weight groups were light (less than 0.1455 g), medium (0.1455 g to 0.2348 g), and heavy (greater than 0.2348 g).  Individual seeds were scanned using an FT-NIR spectrometer from 965 nm to 1701 nm. 32 cm-1 resolution was used and scans were collected at 2 nm intervals.  Sixty-four scans were collected per reading and averaged into one spectrum per seed.  After NIR spectra were collected, the seeds from the first sample set were placed on wet filter paper for germination.  After visual inspection for fourteen days, the seeds with radicle protrusion greater than 2 mm were classified as germinated.  For classification purposes, the non-viable seeds were assigned an arbitrary value of 0 and the viable/germinated seeds were assigned a value of 1.  
Principal Component Analysis (PCA) was performed to analyze spectral differences between the groups. A Partial Least Squares Discriminant Analysis (PLS-DA) model was created using the NIR spectra of the first set of samples and the assigned class values. A PLS-DA model predicts the class value from the NIR spectra and uses the predicted value to classify the sample into one of the groups.

PLS-DA   99.6% Prediction Accuracy   1.1% Classification Error

The results of this study demonstrated the feasibility of classifying viable and non-viable castor seeds using NIR spectra and a PLS-DA model. The PCA analysis showed that the light seeds from the second sample set grouped well with the non-viable seeds from the first set, while medium and heavy seeds grouped well with the viable seeds. This analysis is especially important for castor seeds because of the marked variation of seeds during harvesting. NIR spectroscopy has potential to sort castor seeds based on viability as a quality control tool to ensure that only viable seeds are used for oil extraction in a manufacturing setting.
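
The classification step of a PLS-DA model, in which a continuous predicted value is turned into a class label, can be sketched as below. The 0.5 cut-off (the midpoint between the assigned values 0 and 1) is an assumption made for illustration; the cut-off used in the study is not stated in this summary. The function names and the model outputs are hypothetical.

```python
def classify_viability(predicted_values, threshold=0.5):
    """Map continuous model outputs to class labels: 1 = viable, 0 = non-viable."""
    return [1 if value >= threshold else 0 for value in predicted_values]


def prediction_accuracy(true_labels, predicted_labels):
    """Percentage of samples whose predicted label matches the true label."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)


# Hypothetical germination results and model outputs (NOT data from the study).
germinated = [1, 1, 0, 1, 0, 0, 1, 0]
model_output = [0.92, 0.71, 0.18, 0.55, 0.47, -0.05, 1.08, 0.61]

labels = classify_viability(model_output)
print(labels)
print(prediction_accuracy(germinated, labels))
```

Because the model output is continuous, values can fall below 0 or above 1; only their position relative to the cut-off matters for the assigned class.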

Agriculture | Free Full-Text | Characterisation of Castor (Ricinus communis L.) Seed Quality Using Fourier Transform Near-Infrared Spectroscopy in Combination with Multivariate Data Analysis (mdpi.com) 

Rapid Analysis of Geographical Origins and Age of Torreya grandis Seeds by NIR Spectroscopy and Pattern Recognition Methods 

Torreya is a rare cash crop tree found in southern and eastern China. It is well-known as a potent folk medicine and contains a number of components that possess biological and medical activities. Seeds are rich in proteins, fatty acids, carbohydrates, calcium, phosphorus, and iron. They are served as a high quality nut, and cakes, biscuits, and candies are made from the seed kernels. Oil content of the seeds is between 55% and 61% of the total weight, and the oil is bright yellow with pleasant fruit flavors. Nearly 80% of the fatty acids are unsaturated. Because torreya seeds are such a valuable product, they are subject to adulteration through mislabeling of the province of origin and the age of the seeds. It is known that some provinces in China grow higher quality seeds than other places and one in particular, Zhuji, maintains a Protected Geographical Indication (PGI) and is known for the extra high quality of the seeds. Two other provinces, Anhui and Jiangxi, grow seeds that are similar in appearance but have an inferior taste and texture to those from Zhuji. Seeds that are aged tend to oxidize and spoil, making any extracted oil much lower in quality. NIR spectroscopy was examined for the purpose of discriminating Torreya seeds based on province and age. Two hundred forty samples from the three previously mentioned provinces were procured for the study. Samples came from different growing areas within the three provinces, and some samples from one particular province were designated as old while the remainder of the samples were designated as fresh. Seeds were scanned from 12000 cm-1 to 4000 cm-1 using an FT-NIR spectrometer. Scanning interval was 7.714 cm-1. Various preprocessing methods were applied to the spectral data before chemometric analysis. Principal Component Analysis (PCA) was performed for outlier detection and to analyze differences in the spectral data.
Partial Least Squares Discriminant Analysis (PLS-DA) was performed for classification of the samples based on both geographical origin and age discrimination. 

Geographical Origin 

Sensitivity = 1   Specificity = 1   Classification Rate – 100% Correct

Age Discrimination 

Sensitivity = 0.939   Specificity = 0.871

Perfect results were obtained for classifying the seeds based on geographical origin using NIR spectra and the PLS-DA model.  The model for age discrimination did not show results as good as the geographical origin model, but they are still considered sufficient for screening purposes.  One possible reason for this is that the aged samples only came from one province and the fresh samples also contained samples from this province, while all samples from the other two provinces were fresh.  Better distribution of aged and fresh samples from different provinces and growing areas may improve the results. The potential was demonstrated to use NIR spectroscopy as a tool for classifying Torreya seeds based on geographical origin and age. 
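
Sensitivity and specificity, the two statistics reported for these classification models, can be computed from true and predicted class labels as in the sketch below. Treating one class as the "positive" class (label 1) is an assumption for illustration; which class the study treated as positive is not stated in this summary. The function name and labels are hypothetical.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive=1):
    """Return (sensitivity, specificity) for a two-class problem."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = fn = tn = fp = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                tp += 1  # positive sample correctly identified
            else:
                fn += 1  # positive sample missed
        else:
            if guess == positive:
                fp += 1  # negative sample wrongly flagged as positive
            else:
                tn += 1  # negative sample correctly rejected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the true labels")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels (NOT data from the study): 1 = aged, 0 = fresh.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
guess = [1, 1, 1, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(truth, guess)
print(sens, spec)
```

Sensitivity is the fraction of positive samples that the model identifies correctly, and specificity is the fraction of negative samples that it correctly rejects; a value of 1 for both, as in the geographical origin model, means that no sample was misclassified.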

Rapid Analysis of Geographical Origins and Age of Torreya grandis Seeds by NIR Spectroscopy and Pattern Recognition Methods (hindawi.com) 

Near-infrared reflectance spectroscopy reveals wide variation in major components of sesame seeds from Africa and Asia 

Sesame is an important oilseed crop in both Asia and Africa because of high nutritional value and market value.  Many natural products can show marked variation in quality depending on the area of the world they are grown in, as well as in different areas of the same country.  NIR spectroscopy is one potential tool for determining variation in natural products because it is fast, non-invasive, does not require the use of toxic chemicals and solvents, and has the ability to measure multiple parameters with a single light scan. In this study, NIR spectroscopy was used to examine the variation in sesame seeds from different countries in Asia and Africa.  A total of one hundred thirty-nine samples of cultivated sesame from twenty-eight different countries (fourteen in Africa and fourteen in Asia) were procured for the study.  The samples encompassed different landraces and were all considered elite modern cultivars.  All samples were scanned using an NIR spectrometer from 1100 nm to 2500 nm at an interval of 2 nm and were scanned in duplicate.  Prebuilt calibration models for light and dark seeds were loaded into the spectrometer software.  The models predicted oil, oleic acid, linoleic acid, and protein in the samples.  For some samples, traditional reference tests were performed for the purpose of comparing the reference values to the values predicted from the NIR calibrations.  There was good agreement between predicted results from the NIR method and the traditional reference tests.  Analysis revealed interesting variation among the samples.  Light seeds displayed higher nutritional quality as they had higher values of protein, oil, and linoleic acid than dark seeds.  Samples from Africa had higher oil and linoleic acid contents, while the Asian samples had higher oleic content.  West African samples had particularly high values for nutritional components, giving them potential for increased market value.  
Overall, the results of this study showed that NIR spectroscopy can be used to determine variation in sesame seeds from different areas in the world. Samples from Africa and Asia showed high variation in nutritional components.  Similarly, a study like this could be used to determine ideal sesame seed samples for breeding programs. 

Near-infrared reflectance spectroscopy reveals wide variation in major components of sesame seeds from Africa and Asia – ScienceDirect