I regularly see poor photometry in the light curves. Sometimes with a deviation of more than 0.2 magnitudes. These observers certainly achieve high observation numbers, but they are of poor quality. What does the AAVSO do with these measurements? I don’t want to offend anyone, but what is the AAVSO’s policy regarding such observations?
Very little is done about bad photometry. Despite the efforts of the Data Quality Task Force I believe the problem will persist until:
A) There is a major education effort to teach people to care about the quality of their work.
B) We establish a (volunteer) team to scrutinize photometry. [Yes, they could only examine a sample of what is submitted, but by studying the light curves of popular targets, sources of chronically dubious data will become evident.]
C) We set up sufficient training resources to help observers improve.
D) We accept the necessity of “locking out” observers who refuse to improve.
Tom
In the olde days, when we submitted observations by paper and the AAVSO staff key-punched the data into machine readable format…
the volume of observations was low, and at least for critical objects they also did some level of validation at the data entry stage. Most of the CVs and the LPVs I know were reviewed well.
It has always been the duty and job of the user of the data to verify observations once they are in the database. Filtering out "bad" observations and bad observers was left to the user of the data. It still is.
Usually you can tell when a one-off observation doesn’t make sense and easily remove it from consideration. Some observers have strange biases in their data compared to others. Again easy to filter.
Btw, VStar makes it very easy to do this level of filtering.
However, the massive amounts of automated observing with little or no checking going on today before uploading to the AAVSO db makes the job harder. Many observers are taking photometry in weather conditions not even close to photometric–I know I’ve done it on a few stars!
There are standard statistical ways of filtering data points when you have enough observations. Throw out gross errors, then filter statistically using a model, which can be hard, especially if the star is, say, a cataclysmic variable; but for a pulsator something like a Fourier model can be used. Or heaven forbid just filter by the human eye and perception! This should be left to the data user mostly IMHO. Gross errors if detected usually stand out like a sore thumb.
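For anyone who wants to try the Fourier-model filtering mentioned above, here is a minimal sketch, not a recommended pipeline. It assumes the period is already known, fits a truncated Fourier series by least squares, and flags points whose residuals exceed a multiple of a robust scatter estimate. The function names, the number of harmonics, and the 3-sigma threshold are all illustrative assumptions, and this approach would not suit an irregular star such as a cataclysmic variable.

```python
import numpy as np

def fourier_design(t, period, n_harmonics=2):
    """Design matrix for a truncated Fourier series with a known period."""
    phase = 2.0 * np.pi * np.asarray(t, dtype=float) / period
    cols = [np.ones_like(phase)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.sin(k * phase))
        cols.append(np.cos(k * phase))
    return np.column_stack(cols)

def flag_outliers(t, mag, period, n_harmonics=2, k_sigma=3.0, max_iter=5):
    """Boolean mask of points rejected by iterative clipping of the
    residuals from a least-squares Fourier fit (True = suspect)."""
    t = np.asarray(t, dtype=float)
    mag = np.asarray(mag, dtype=float)
    A = fourier_design(t, period, n_harmonics)
    keep = np.ones(mag.size, dtype=bool)
    for _ in range(max_iter):
        # Fit only the points currently considered good
        coeffs, *_ = np.linalg.lstsq(A[keep], mag[keep], rcond=None)
        resid = mag - A @ coeffs
        # Robust scatter from the median absolute deviation (MAD)
        med = np.median(resid[keep])
        sigma = 1.4826 * np.median(np.abs(resid[keep] - med))
        new_keep = np.abs(resid - med) <= k_sigma * max(sigma, 1e-6)
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return ~keep
```

The flagged points are candidates for a closer look by eye, not automatic deletions; a real outburst or eclipse would be flagged just as readily as a bad measurement.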
The real solution is for observers to do data checking, get the observing and reduction process stable, transform, and not be in a rush to upload. Some are doing this I’m sure but now for the rest of us.
Jim (DEY)
Amen to that!
The two worst problems we are finding in the all-sky photometry era, when people report data for everything they can in their images, are saturation and blending.
When reporting all objects in an image, the brighter stars might be saturated and the faintest stars won’t get a good signal-to-noise ratio. Stars should be selected to avoid reporting objects that won’t meet photometric quality standards.
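The selection step described above can be sketched as a simple filter. This is only an illustration: the `StarMeasurement` fields, the saturation level, the 80% linearity margin, and the minimum SNR of 20 are assumptions that each observer would replace with values measured for their own camera.

```python
from dataclasses import dataclass

@dataclass
class StarMeasurement:
    name: str
    peak_adu: float  # brightest pixel inside the measuring aperture
    snr: float       # signal-to-noise ratio of the measurement

def is_reportable(star, saturation_adu, max_fraction=0.8, min_snr=20.0):
    """True if the star is below the assumed saturation margin and
    above the assumed SNR floor."""
    not_saturated = star.peak_adu < max_fraction * saturation_adu
    good_signal = star.snr >= min_snr
    return not_saturated and good_signal

# Hypothetical measurements from one image
stars = [
    StarMeasurement("bright field star", 64000.0, 900.0),
    StarMeasurement("target", 21000.0, 150.0),
    StarMeasurement("faint field star", 900.0, 6.0),
]
reportable = [s.name for s in stars if is_reportable(s, saturation_adu=65535.0)]
print(reportable)
```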
And images should also be visually inspected to avoid reporting stars that are blended with others. Reduced amplitudes, brighter zero points, and sometimes plain wrong identifications are not uncommon. A typical situation is reporting a bright (e.g. 11th mag.) star which is close to a faint (e.g. 17th mag.) variable as if it was the variable itself. So we get 11 mag. measures for a 17th mag. star that is actually invisible.
Cadence is another issue. If you report both a Mira variable with a 360 d period and a DSCT with a period of 0.03 d, time series will be useful for the latter but won’t be useful for the former.
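A rough pre-check for blending, to complement the visual inspection described above, could look like the sketch below. It assumes you have catalogue positions and magnitudes for the neighbours, and it only tests whether a neighbour's centre falls inside the aperture; PSF wings from stars just outside are ignored. The function names and the 1% flux limit are hypothetical.

```python
import math

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation in arcseconds; inputs in degrees."""
    dra = (ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    ddec = dec1 - dec2
    return 3600.0 * math.hypot(dra, ddec)

def contamination_fraction(target_mag, neighbour_mag):
    """Neighbour flux divided by target flux."""
    return 10.0 ** (-0.4 * (neighbour_mag - target_mag))

def is_blended(target, neighbours, aperture_radius_arcsec, max_fraction=0.01):
    """target and neighbours are (ra_deg, dec_deg, mag) tuples."""
    ra, dec, mag = target
    for n_ra, n_dec, n_mag in neighbours:
        inside = separation_arcsec(ra, dec, n_ra, n_dec) <= aperture_radius_arcsec
        if inside and contamination_fraction(mag, n_mag) > max_fraction:
            return True
    return False

# The case from the post: a 17th mag variable with an 11th mag star 5" away
variable = (100.0, 20.0, 17.0)
bright_neighbour = (100.0, 20.0 + 5.0 / 3600.0, 11.0)
print(is_blended(variable, [bright_neighbour], aperture_radius_arcsec=8.0))
```

In that example the neighbour is about 250 times brighter than the variable, so any measurement through that aperture is really a measurement of the 11th mag. star.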
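To put rough numbers on the cadence point, here is a small arithmetic sketch. The target of 20 points per cycle is purely a hypothetical figure for illustration, not an AAVSO recommendation.

```python
def suggested_interval_days(period_days, points_per_cycle=20):
    """Spacing between observations that gives roughly points_per_cycle
    samples over one cycle. The default of 20 is a hypothetical target."""
    if period_days <= 0 or points_per_cycle <= 0:
        raise ValueError("period_days and points_per_cycle must be positive")
    return period_days / points_per_cycle

for name, period in [("Mira, P = 360 d", 360.0), ("DSCT, P = 0.03 d", 0.03)]:
    interval = suggested_interval_days(period)
    print(f"{name}: about one point every {interval:.4f} d")
```

Under that assumption the Mira needs a point every couple of weeks or so, while the DSCT needs one every couple of minutes, which is why a dense time series only adds information for the latter.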
Paying attention to all these things will improve the overall quality of our database.
Cheers,
Sebastian
I am thankful that Sebastian has caught some of the blended stuff that I have unwittingly tried to submit. As for the saturation problem, I find that VPHOT has a nice filter for that. It gets grayed out and not reported.
To put some numerical value on it for me; 7%:
Besides carelessness, there are dozens of physical reasons that bad data can escape my observatory.
I regularly patrol my submissions with VSTAR and the biggest impediment that I am finding is that WebOBS does not always allow me to delete my bad data. There isn’t a lot of it, maybe 7%, but it is way more than HQ staff can handle. 7% may sound like a lot. It could easily be much higher, but I follow a whole litany of procedures to weed out bad stuff before it gets submitted. Then 7% may be good because I am too fussy or 7% may be bad because I am not careful enough.
Just one physical example:
I have a new camera now that just cannot replicate the quality of observations from the older camera. I think that has to do with well depth.
Always some new challenge.
Ray
Hello! I have a four stage process for evaluating my data - 1) look at the images, 2) look at the reduced error bar for each observation, 3) look at the light curve after uploading the data, and 4) ask forum members for feedback if something does not seem to fit.
1) Looking at the image allows me to detect bad trailing, significant clouds, plane trails, etc., and throw away the images with obvious problems. I don’t throw away everything. For example, before I updated my mount for my 8" LX200, I had lots of images with trailed stars rather than round. I was surprised by the accuracy and precision of the reductions that I was still able to get when comparing my data with others. Also, with my narrow FOV of 12’x20’, clouds were much less a problem than for others with wide fields, since comps and targets tend to be affected equally (and the programs that let me see the images always stretched them and magnified the cloud effect!), and I figured problems with clouds that caused a gradient through the images would show up as increased error bar in any case.
2) After that, I check error range after reductions. Arbitrarily, I throw out the few reductions that have errors greater than 0.15 mags.
Why 0.15 mags? When I first started out, I had a lot of problems with remote focus between targets with my Meade 8" SCT. So, I asked if photometry with stars with donut holes was still useful when the error bar was high (higher than 0.15 mag!). Arne said yes, as long as the error bar was noted as well as any other pertinent notes about the images. Researchers could then decide for themselves whether to include the data or not. Typically, such observations were in line with what others had even with the high error bar.
Now, such problems are much less common, and I throw away images with even small donut holes since I've figured out my equipment.
3) I look at the light curve for each observation and I throw out any that don't fit. There are observation campaigns that rely on observations that don't fit, such as detecting novae, etc., but in those cases, I ask on the forum before discarding or accepting something out of the ordinary.
4) Finally, if something does not fit, forum members are great for help in identifying a problem. For example, I was getting low grade magnitude bumps on ZTF J050738.71+301649.2. (I had added this star since it was in the field of another variable that I was imaging). I could not figure out where the bump was coming from since the images looked fine, the error bar was fine, etc. It turned out, a single hot pixel was not being eliminated by the calibration software and was periodically being included in the star annulus. A forum member was kind enough to look at the images and help me sort through it. After this experience, when I then detected a magnitude spike in V0844 HER, before deleting the images, I asked the forum members if it was spiking, and it was.
Anyway, that is my 4-step work flow to try to ensure the quality of the data. With my nearly 20-year-old equipment, it seems to work well. Best regards,
Mike
I find that other useful indicators of data quality are:
SD of the Var measurements (for LPVs); SD of the Check measurements; measured check magnitude minus catalogue check magnitude. These are useful because they represent a check on all possible sources of error.
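The two check-star indicators listed above are easy to compute from a night's reductions. A minimal sketch, with a hypothetical function name and made-up example magnitudes:

```python
import statistics

def check_star_summary(check_mags, catalogue_mag):
    """Return (sample SD of the check magnitudes,
    mean measured check magnitude minus catalogue magnitude)."""
    sd = statistics.stdev(check_mags)
    offset = statistics.fmean(check_mags) - catalogue_mag
    return sd, offset

# Made-up check-star magnitudes from one session
sd, offset = check_star_summary([11.02, 11.00, 11.04, 10.98, 11.01], 11.00)
print(f"check SD = {sd:.3f} mag, measured minus catalogue = {offset:+.3f} mag")
```

A small SD with a large offset points to a systematic problem (comp sequence, colour term, wrong star), while a large SD points to noisy or unstable measurements.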
For non-transformed measurements it is informative to compare the colour index of the var and check with the colour index of each comp. It may not matter much if filter transform coefficients are close to zero. It may matter a lot if the TC is not close to zero, as for DSLR cameras. Ensembles of comps will improve things only if the colour index of the Var is somewhere near the average CI of the comps.
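To see how much the colour-index mismatch can matter, here is a sketch of the size of the colour term for a single-filter V measurement. It assumes the common convention in which the transformed differential magnitude equals the instrumental differential magnitude plus Tv_bv times the difference in catalogue (B-V) between variable and comp. The coefficient values and colour indices below are hypothetical, and the ensemble version simply uses the unweighted mean colour of the comps.

```python
from statistics import fmean

def transform_correction(tv_bv, ci_var, ci_comp):
    """Colour-term correction (mag) to add to an untransformed V
    magnitude, assuming V - v = Tv_bv * (B-V) + constant."""
    return tv_bv * (ci_var - ci_comp)

def ensemble_correction(tv_bv, ci_var, comp_cis):
    """Same, using the unweighted mean colour index of an ensemble."""
    return transform_correction(tv_bv, ci_var, fmean(comp_cis))

# Hypothetical red variable (B-V = 1.6) against a bluer comp (B-V = 0.6)
for tv_bv in (0.01, 0.15):
    corr = transform_correction(tv_bv, 1.6, 0.6)
    print(f"Tv_bv = {tv_bv:.2f}: correction = {corr:+.3f} mag")
```

With a coefficient near zero the correction is buried in the noise; with a larger coefficient and a one-magnitude colour difference it is comparable to the 0.2 mag deviations that started this thread.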
My take is .2m is not necessarily bad data. Variable star science was built entirely on visual observations, until not so very long ago, where .2 was great. That the error bar is somewhat large does not mean the data is bad or not useful. It depends on what you are trying to accomplish. In some cases, highly precise data is needed, in others, imprecise data is better than none.
I have maintained a program to follow about 170 stars every possible night for years. My error bars go up and down with conditions and the magnitude swings of the stars. I can’t possibly fine tune each night’s observations like one would if precision were the priority. I go to great lengths to toss erroneous readings but would not remove data just because the error bar crosses some threshold.
The result is I have fairly continuous curves for these stars. In most cases, without my data, all you see on the AID is an assortment of one-night-stand observations that don’t allow for characterizing the star or tracking its history.
I have been able to reclassify five or six stars using my data and suspect there are several others in my collection that should be - I just have not had the time yet. I have also discovered a couple of new variables including an eclipsing binary.
I process my data hundreds of observations at a time and run a simple statistical analysis comparing the relationship of the comp and reference star to spot suspect observations.
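The post does not say exactly which analysis is used, so the following is only one possible sketch of such a comp-versus-reference test. It takes the instrumental magnitudes of the comp and reference star from each image, forms their difference (which should be constant if both stars are constant), and flags images where that difference strays from the median by more than a multiple of a robust scatter estimate. The function name and the 3-sigma threshold are assumptions.

```python
import numpy as np

def flag_suspect(comp_inst, ref_inst, k_sigma=3.0):
    """True where the comp-minus-reference instrumental magnitude
    deviates from its median by more than k_sigma robust sigmas."""
    diff = np.asarray(comp_inst, dtype=float) - np.asarray(ref_inst, dtype=float)
    med = np.median(diff)
    # Robust scatter from the median absolute deviation (MAD)
    sigma = 1.4826 * np.median(np.abs(diff - med))
    return np.abs(diff - med) > k_sigma * sigma

# Made-up instrumental magnitudes; the fifth image has a problem
comp = [-7.50, -7.51, -7.49, -7.50, -7.20, -7.52, -7.48]
ref = [-6.00, -6.01, -6.00, -5.99, -6.00, -6.02, -5.99]
print(flag_suspect(comp, ref))
```

Because the test uses only the two constant stars, it says nothing about the variable itself, which is the point: a flagged image is suspect regardless of what the target was doing.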
I don’t transform my data because 1) it is unnecessary to what I am doing and 2) I think it would misrepresent the quality of my data to say it is transformed. It would be like carrying the decimal to three places for two place data.
Think long-term
AAVSO does not collect data just for today’s projects - we are providing a historical record for researchers decades in the future. Is there really a reason not to do the best we reasonably can in all circumstances? I am appalled at the bad and mediocre data that people have submitted to the AID, data that obscure the good work done by others.
“I don’t transform my data because 1) it is unnecessary to what I am doing and 2) I think it would misrepresent the quality of my data to say it is transformed. It would be like carrying the decimal to three places for two place data.”
It could be argued that transforming data is unnecessary when the colour indices of the var and comp are nearly the same, or if the filter transform such as Tv_bv is near zero, or if the purpose of obtaining the data is specifically and only for timing maxima or minima of light curves, or if the data is for private use only and transformed data is not needed for that private purpose. When data is submitted to a public database, it could be argued that the observer should justify the decision not to transform the data. The AAVSO recommends transforming data.
In view of the above it would be interesting to know why it is unnecessary to transform your data.
I suspect but don’t know that the large amount of non-transformed data in the AID is simply because it is easier to obtain non-transformed data.
Concerning your second point, data is either transformed or it is not - so one can’t just ‘say’ that it is transformed. Therefore I don’t understand the following sentence about three or two significant places.
Tom, Roy, and others have made good points in this thread. The problem with the vast amounts of crummy and untransformed data is one result of the come-one-come-all policy of the AAVSO. The intention obviously is to not be off-putting to beginners, which is fine. But it also means any user of the data will have to trawl through the database to clean it up before making any analysis of it. Most folks are not going to do that.
Your data is likely to have more than one use, and so keeping all the original images, calibration frames etc in some archive is highly valuable. Even if you are using “today’s best” software for reductions, don’t think that it can’t be done better in the future or that it is all that can be done with the data. Folks also say that they are just doing differential photometry, so it doesn’t matter about transformations, zero-point etc. That is also incorrect and short-sighted, as Tom notes. The results will be far more valuable if they are transformed properly.
I thus highly recommend that one observe one or two standard fields (Landolt, southern E-regions, or other) on any night where you think it is cloud-free — even if you don’t make immediate use of those images for calibration.
Another thought here is: I wonder if it would be useful to collect from the literature (or current ‘reliable’ sources) properly calibrated data for the AID to serve as a fiducial baseline to allow comparison with the general run of incoming values.
Brian
Because the focus of my program is to track the curve of my stars as fully as I can night after night in various photometric conditions and altitudes using a robotic system and fixed exposures, the error bars are typically in the .01-.03 range and sometimes worse at the bottom of light curves. The process is inherently noisy. The change in my readings from transformation would be a fraction of that. It is swallowed by the inherent error in the data even after I toss the openly erroneous data points. It would be straining at a gnat while swallowing an elephant.
I will note that there are transformations and transformations. Some people think you haven’t really transformed properly unless you run the transformation each night or even for each observation. That is not possible for me to do.
I should also confess that almost all my data is V. It is possible to do one color transformation, but then the improvement achieved would likely be less.
In my view, it boils down to what you are trying to accomplish. If you are setting up optimized individual multi color projects under photometric conditions with the intent of measuring something that requires the utmost accuracy, by all means transform your data.
Hello! I used to track high and low Landolt fields every observing session when clouds allowed, but I don’t do this any more.
I’m curious if there is an advantage to resuming this.
My reasoning for stopping Landolt field imaging - with my narrow 12’x20’ field, M67 always had many more stars than the Landolt fields, so the transform value of the Landolt fields compared with M67 always had higher error bars.
Additionally, given the ensemble photometry with the narrow field, there did not seem to be an added advantage to adding Landolt fields to the image session since extinction coefficients would not be needed.
Thank you for your guidance with this. Best regards.
Mike
As a newbie, I have also wondered what good data is. I do photometry using a SeeStar S50 (50mm aperture and Sony IMX462 sensor), and ASTAP. I imaged several Landolt fields to get transformation of my data. I will average stack groups of 12-24 twenty second exposures, depending on the magnitude of the star I am imaging. For magnitude 10, a stack of 15 twenty second exposures is my “sweet spot”. I only image stars with an SNR greater than 100. I tend to focus on HADS, or variables with periods less than 6 hours due to my limited area of the sky that I can image.
As part of the ASTAP program, it calculates the standard deviation of the check star, as well as a difference between the measured check magnitude minus catalogue check magnitude. Depending on the variable, I get SD typically less than 0.015 for R, less than 0.01 for B, and less than 0.007 for G, after transformation.
Is this good data?
One thing that I have not found, other than the DSLR recommended practice manual, is documentation of what the definition of “good” data is, and the methodology to improve the quality of data.
Scott (MDSA)
Scott’s point here is one that I was thinking of as I read through this thread. On one hand, “The observer is responsible for the quality of their data.” However, for newbies there are few resources that really define “quality data”. This is something the organization should address with a vigorous training program for newbies.
Another perspective is who you are collecting the data for? If it’s for a personal project, it may be good enough for you but will it be good enough for a future researcher?
In the last several years I remember some comment that professionals don’t want AAVSO data because it’s not of good quality. Well, some of it is but I’m sure they do not want to wade through the data mixed in that is mediocre or worse.
Since part of the AAVSO’s existence depends on providing data to professionals, they would do well to put more effort into ensuring data quality. I think that pronouncements about the observer being responsible for their own data quality haven’t worked. More training and REAL feedback for new observers needs to be done.
My two cents,
…Tim (HTY)
Hi Tim. I think everyone in the AAVSO (Board, Executive Director, staff, members, and observers) would agree with your statement. More training and REAL feedback is needed. LOTS of work has been going on for a long time to do more of both. A few examples of existing programs are the AAVSO mentor program, training courses, manuals, youtube videos on the AAVSOHQ channel, forums where questions can be asked, and automation to visualize your data against others’ as you submit it.
In the end, the small size of the AAVSO staff limits what they can do to directly help observers. We definitely need more volunteer help with this. One of the things I do when I see suspect data on a light curve is reach out in a friendly way to the observer and discuss it with them. The response has almost always been excellent. Most people do want to provide great data and want to know what they can do to improve.
Clearest skies,
Walt
Hi Walt, and thanks for your quick reply. Those resources are great but they all require initiation by the observer. We have on staff according to the AAVSO web page, an executive director, an operations manager, a staff astronomer, an external consultant (who does an excellent job by the way), a software developer and a volunteer coordinator.
May I suggest it might be really useful to appoint/hire a quality control manager. If it needs to be a volunteer, well as dedicated and outstanding as a volunteer may be, it will never be their top priority. However, if it needs to be a volunteer, I submit that will be better than the current situation. That person can look at incoming data (use AI here to assist if you have to) and proactively contact observers when things seem to be “going south” or not up to the standard that professionals may need to use the data. That person can assign a mentor if need be.
I realize that as observing gets more and more automated, the avalanche of data arriving to our servers gets larger every day/month. That’s why I suggested AI examination of the incoming data. Let’s not make perfection the enemy of the good. Apply the 80/20 principle and go after the 20 percent who submit 80 percent of the mediocre data. Proactively encourage best practices without them having to search for it. Show them how to transform PERSONALLY. Don’t just point them to resources that they may or, in the case of beginners, probably don’t understand.
Okay, I’ve ranted enough. I have more thoughts but this is long enough.
Thanks again for your quick reply Walt.
Cloudy here tonight so I can CAREFULLY reduce my data.
…Tim (HTY)
Sorry to hear it is cloudy. Here too. The AAVSO is doing some hiring this year. An educator and a volunteer coordinator have been the priorities for those positions. Both areas of need. The educator role will include updating and improving training materials and opportunities. One of the ideas is more short, on-demand types of training videos/modules since written manuals can be daunting and not necessarily the best format for a lot of people to learn. The volunteer coordinator will help improve coordination of the huge number of volunteers that help in lots of roles.
The Data Quality Task Force spent a lot of time looking at automated ways to identify bad data. That turned out to be a very difficult thing to figure out. The most straightforward ways don’t work very well in many cases unfortunately. Variable stars do too many things in too many different ways for simple statistical methods to be very robust. If the systems aren’t very robust, it means people get notifications of problems with their data that aren’t problems with their data and that’s not a good thing either.
The Data Quality Task Force’s work was pre-AI boom. Maybe that changes things. I’m not sure. I’m still afraid of sending emails to people saying we think their data is bad if it’s really the AI algorithm that’s bad. If AI is right 80% of the time, that’s still a lot of emails telling people the wrong thing.
Your point about personal contact with people to help them learn and develop their skills is dead on. The more people helping with that the better. We all had to learn and we all have made mistakes in the process. I think organizing and expanding that effort is something the new volunteer coordinator and educator will really be able to help with.
-Walt
Would anyone consider volunteering for a quality control role? That would be a great help for the AAVSO and for observers, especially new ones. My email is wcooney@aavso.org if you want to discuss it some.
-Walt
I notice that my remark is causing quite a stir. It wasn’t actually about the new observers, but when I look at light curves online, I sometimes see a lot of points in a curve with a great deal of spread, and these were logged by experienced photometric observers. Sometimes it is only a small part of a curve, which makes one realize that this serves no other purpose than to inflate the number of observations so that people can say that you, as an observer, have made many observations. But unfortunately, the quality of those observations is very poor.