We just ran the trove sync script and have dupes now. Scala is just one example:
<option value="704">Scala</option>
<option value="704">Scala</option>
{ "_id" : ObjectId("4e833498bfc09e796b000040"), "trove_parent_id" : 160, "trove_cat_id" : 704, "shortname" : "scala", "fullname" : "Scala", "fullpath" : "Programming Language :: Scala" }
{ "_id" : ObjectId("4e947427bfc09e1fd4000040"), "trove_parent_id" : 160, "trove_cat_id" : 704, "shortname" : "scala", "fullname" : "Scala", "fullpath" : "Programming Language :: Scala" }
Maybe we could just add a unique index on trove_cat_id so that the dupes get dropped.
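A rough sketch of that idea, assuming a MongoDB release older than 3.0, where ensureIndex still accepts the dropDups option (it was removed in 3.0). This has not been run against our data; with dropDups, the server keeps the first document it sees for each key and deletes the rest:

```javascript
// Mongo shell, pre-3.0 only: build a unique index on trove_cat_id and
// let the server delete any documents that would violate it.
db.trove_category.ensureIndex(
    { trove_cat_id: 1 },
    { unique: true, dropDups: true }
);
```

On a newer server the dupes would have to be removed first, and the unique index added afterwards to keep them from coming back.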
This can be fixed by opening up a mongo console and running these commands:
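use pyforge
// Walk the categories in trove_cat_id order; any document whose id matches
// the one before it is a duplicate and gets removed.
var previous_id = null;
db.trove_category.find().sort({'trove_cat_id': 1}).forEach(function(current) {
    if (current.trove_cat_id == previous_id) {
        // Same id as the previous document: drop this copy.
        db.trove_category.remove({'_id': current._id});
    } else {
        previous_id = current.trove_cat_id;
    }
});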
To test this, I ran the trove sync script a few times to make dupes, ran the above commands, then confirmed only one of each category was left. Once this passes review, a sog ticket will be needed to ask them to run this on production.
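For whoever reviews this, the "only one of each category" check could be scripted along these lines. This is a minimal sketch, not what was actually run: findDuplicateCatIds is a hypothetical helper, and the sample rows are made up apart from the Scala id. It is plain JavaScript, so it runs in Node as written; in the mongo shell the input would presumably be db.trove_category.find().toArray().

```javascript
// Hypothetical helper: returns each trove_cat_id that occurs more than
// once in an array of category documents.
function findDuplicateCatIds(docs) {
    var counts = {};
    var dupes = [];
    docs.forEach(function (doc) {
        var id = doc.trove_cat_id;
        counts[id] = (counts[id] || 0) + 1;
        // Record the id once, the moment a second copy shows up.
        if (counts[id] === 2) {
            dupes.push(id);
        }
    });
    return dupes;
}

// Illustrative data modeled on the Scala dupes; the second category is made up.
var sample = [
    { trove_cat_id: 704, shortname: "scala" },
    { trove_cat_id: 704, shortname: "scala" },
    { trove_cat_id: 999, shortname: "example" }
];

console.log(findDuplicateCatIds(sample)); // [ 704 ]
```

An empty array after running the cleanup would mean no dupes are left.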
Works fine on my sandbox, prod ticket: https://control.siteops.geek.net/sog/trac/ticket/19214